thephpleague / openapi-psr7-validator

It validates PSR-7 messages (HTTP request/response) against OpenAPI specifications
MIT License

JSON object not accepted as part of multipart payload #234

Open dmjohnsson23 opened 1 month ago

dmjohnsson23 commented 1 month ago

The validator does not appear to accept JSON objects as "properties" of a multipart payload. For example, in this trimmed-down schema:

openapi: 3.1.0
paths:
  /intake:
    post:
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/Intake'
          multipart/form-data:
            schema:
              type: object
              required: 
                - data
              properties:
                data:
                  $ref: '#/components/schemas/Intake'
              patternProperties:
                "^doc_.*$":
                  type: string
                  format: binary
                  description: |-
                    This is a document that will be attached to the intake request. You may 
                    provide additional metadata about this document in the `documents` property of 
                    the intake object.
        required: true

When sending an application/json request, the validator accepts the request:

curl -X POST https://dev-server/api/intake -H "Authorization: Bearer $token" -H "Content-Type: application/json" -d '{...data here...}'

However, if I wrap the exact same data in a multipart/form-data payload, it fails by throwing a \League\OpenAPIValidation\Schema\Exception\TypeMismatch exception on the data property with the message "Value expected to be 'object', but 'string' given."

curl -X POST https://dev-server/api/intake -H "Authorization: Bearer $token" -F 'data={...data here...};type=application/json'

According to this link, I should be able to have a JSON part in a multipart payload by specifying type: object. The 3.1 spec also states that JSON should be the default encoding for "object" properties in a multipart payload. The validator, however, seems to only interpret the content as a string.
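For reference, the default part encodings described above can be sketched as a small lookup. This is only an illustration of how I read the spec's defaults for multipart properties: `defaultPartContentType` is a hypothetical helper, not something in the validator, and it assumes the property schema has already been resolved (no `$ref`) and is passed as a plain array.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the validator): default Content-Type of a
 * multipart part, derived from the property's resolved schema.
 */
function defaultPartContentType(array $propSchema): string
{
    $type = $propSchema['type'] ?? null;

    if ($type === 'object') {
        return 'application/json';
    }
    if ($type === 'array') {
        // Arrays take their default from the item schema.
        return defaultPartContentType($propSchema['items'] ?? []);
    }

    // Binary strings: 3.0-style "format: binary" or 3.1-style "contentEncoding".
    $isBinary = ($propSchema['format'] ?? null) === 'binary'
        || isset($propSchema['contentEncoding']);
    if ($type === 'string' && $isBinary) {
        return 'application/octet-stream';
    }

    // Remaining primitives default to plain text.
    return 'text/plain';
}
```

Under this reading, the `data` property in the schema above (an object) would default to application/json, and the `doc_*` properties would default to application/octet-stream.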

dmjohnsson23 commented 1 month ago

I spent some time digging around in the source code for the validator. I did find this interesting discrepancy:

https://github.com/thephpleague/openapi-psr7-validator/blob/a665e220d0bba68411efc1899ed5022fd9b10113/src/PSR7/Validators/BodyValidator/MultipartValidator.php#L99-L104

VS

https://github.com/thephpleague/openapi-psr7-validator/blob/a665e220d0bba68411efc1899ed5022fd9b10113/src/PSR7/Validators/BodyValidator/MultipartValidator.php#L297-L301

The former contains this line to parse the body before validating, where the latter does not:

$body = $this->deserializeBody($this->parseMultipartData($addr, $document), $schema);

I tried copying that line from the validatePlainBodyMultipart function to the validateServerRequestMultipart function. It didn't work, and I don't understand the validator's internals well enough to chase down exactly what it's doing and why, but I thought I'd at least post it in case it's a useful lead. It is a ServerRequestInterface that I'm validating, via the RoutedServerRequestValidator, so it seemed likely to be related.

dmjohnsson23 commented 1 month ago

It looks like after this block:

https://github.com/thephpleague/openapi-psr7-validator/blob/a665e220d0bba68411efc1899ed5022fd9b10113/src/PSR7/Validators/BodyValidator/MultipartValidator.php#L290-L294

$body contains an array of strings:

array (size=2)
  'data' => string '...truncated...' (length=5597)
  'doc_1' => string '~~~binary~~~' (length=12)

Something needs to happen to parse the JSON here before validation. If I can figure this out, I'll submit a PR, but I may have to defer to someone who knows the codebase if I can't. Advice would also be appreciated.
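To make the missing step concrete, here is a rough sketch of the kind of pre-processing I mean, written outside the validator. `decodeJsonParts` is a made-up name, the schema properties are assumed to be resolved plain arrays, and decoding to associative arrays is my assumption rather than what the validator necessarily expects. It also ignores patternProperties and the part's actual Content-Type.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the validator): JSON-decodes every string
 * part whose schema declares "type: object"; other parts are left untouched.
 *
 * @param array<string, mixed> $parts      part name => raw value
 * @param array<string, array> $properties part name => resolved property schema
 * @return array<string, mixed>
 * @throws JsonException when an object part is not valid JSON
 */
function decodeJsonParts(array $parts, array $properties): array
{
    foreach ($properties as $name => $propSchema) {
        $isObject = ($propSchema['type'] ?? null) === 'object';
        if ($isObject && isset($parts[$name]) && is_string($parts[$name])) {
            $parts[$name] = json_decode($parts[$name], true, 512, JSON_THROW_ON_ERROR);
        }
    }

    return $parts;
}
```

With the array shown above, only the `data` entry would be decoded; `doc_1` would stay a string.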

dmjohnsson23 commented 1 month ago

I've found if I change this:

https://github.com/thephpleague/openapi-psr7-validator/blob/a665e220d0bba68411efc1899ed5022fd9b10113/src/PSR7/Validators/BodyValidator/MultipartValidator.php#L297-L301

To this:

        try {
            $body = $this->deserializeBody($body, $schema);
            $validator->validate($body, $schema);
        } catch (SchemaMismatch $e) {
            throw InvalidBody::becauseBodyDoesNotMatchSchema($this->contentType, $addr, $e);
        }

And also change this:

https://github.com/thephpleague/openapi-psr7-validator/blob/a665e220d0bba68411efc1899ed5022fd9b10113/src/PSR7/Validators/BodyValidator/BodyDeserialization.php#L32

To this:

            $param           = new SerializedParameter($propSchema, 'application/json');

Then the request is validated properly (assuming, of course, that the request contains only JSON data). The only remaining question is how to get the expected content type rather than hard-coding it. The existing detectEncodingContentTypes method seems to already implement the required logic, but it needs other data (namely, the part's Content-Type header) to call. Frankly, I have no idea how to get that Content-Type header, as ServerRequestInterface doesn't seem to expose it...
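One possible direction, if the raw multipart body is still readable from the request, is to pull the per-part headers out of it directly. The sketch below is only an illustration under that assumption: `partContentTypes` is a hypothetical function, not validator code, and it expects the raw body string and the boundary to be supplied by the caller. With PHP's built-in form handling the raw multipart body may not be available at all, so this may not apply to every setup.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the validator): maps each part name to its
 * declared Content-Type, or null when the part carries no such header.
 *
 * @return array<string, string|null>
 */
function partContentTypes(string $rawBody, string $boundary): array
{
    $types = [];

    // Parts are separated by "--<boundary>"; the first chunk is the preamble.
    $chunks = explode('--' . $boundary, $rawBody);
    array_shift($chunks);

    foreach ($chunks as $chunk) {
        if (strncmp($chunk, '--', 2) === 0) {
            break; // closing delimiter "--<boundary>--"
        }

        // The header block ends at the first blank line.
        $headerBlock = explode("\r\n\r\n", ltrim($chunk, "\r\n"), 2)[0];

        $name = null;
        $type = null;
        foreach (explode("\r\n", $headerBlock) as $line) {
            if (preg_match('/^Content-Disposition:.*\bname="([^"]*)"/i', $line, $m)) {
                $name = $m[1];
            } elseif (preg_match('/^Content-Type:\s*(.+)$/i', $line, $m)) {
                $type = trim($m[1]);
            }
        }

        if ($name !== null) {
            $types[$name] = $type;
        }
    }

    return $types;
}
```

For the curl command in the original report, this would be expected to yield application/json for `data`, which could then be passed along instead of the hard-coded value.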