Closed mbsimonovic closed 5 years ago
function getSchemaForMediaType(contentType:'multipart/form-data; boundary=--------------------------480295105673604652952056', requestBodySpec='content: multipart/form-data') {
seems to be the problem:
@mbsimonovic please submit a PR to resolve this issue.
@jsdevel not sure now to setup and run tests, but the simplest fix could look like this (line 313):
if (contentType.startsWith('multipart/form-data')
&& mediaTypeKey.startsWith('multipart/form-data')) {
return mediaTypeKey;
}
another option is to use a module, https://github.com/jshttp/content-type for example:
const contentType = require('content-type');
contentType.parse('multipart/form-data; boundary=----WebKitFormBoundaryWyK9kAU7d35AKf26').type == 'multipart/form-data';
contentType.parse('multipart/form-data').type == 'multipart/form-data';
contentType.parse('image/svg').type == 'image/svg';
contentType.parse('text/html; charset=utf-8; foo=bar
').type == 'text/html';
@mbsimonovic see the README for development help
btw shouldn't this work in typescript when importing an untyped js module:
import contentTypeParser from 'content-type'; //doesn't work
// const contentTypeParser = require('content-type'); // works fine
Depends on the module you want to import. Does it export default es6 syntax?
I seem to be running into issues with OpenApi3 with multipart/form-data
. I created an express-openapi
test (that current fails) here: https://github.com/mendelk/open-api/tree/openapi3-multipart-form-consumes-middleware.
Before I create an issue, can someone confirm that this is actually a bug, or maybe I'm doing something wrong?
@jsdevel @mbsimonovic
Thank you!
i'll defer to @mbsimonovic
Where and why exactly does it fail? The test just says bad request. We need logging!! :)
the error says:
{
message: "cannot POST /v3/consumes (400)",
type: "text/html",
text: `<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Error</title>
</head>
<body>
<pre>[object Object]</pre>
</body>
</html>
`
ideally i would now just turn on the debug level for openapi and see what's going on.
@mbsimonovic I'll paste the exact error once I get to my desk, but a 400
error is a validation error. If you log the error (say, using middleware) you'll see the error is due to the spec expecting a string, and instead an object (representing the file) is passed. The bug is that a string of format binary should validate against the multer
object, correct?
Thanks @mbsimonovic for your help with this!
(As an aside, the <pre>[object Object]</pre>
also gets logged to stdout
and seems to be a bug somewhere...)
i somehow got multipart working for me, so perhaps i could strip it down and give it to you so you have a working version?
That would be great! (My example is pretty stripped down, but a working starting point would be awesome.)
here you go. so npm install
first, then npm run test
.
Thanks @mbsimonovic .
In your example, changing the spec like so:
multipart/form-data:
schema:
required:
- files
type: object
properties:
files:
type: array
items:
type: string
format: binary
(Note the required
property.)
This causes the following error:
errors: [
{
"path": "files",
"errorCode": "required.openapi.validation",
"message": "should have required property 'files'",
"location": "body"
}
]
This, I believe, is the crux of our issue!
Edit, to explain further: The reason your spec is validating is because the files
property is in fact not being validated, because it's an optional property. From the validator's point of view, that property doesn't exist to validate. Making it a required field changes all that. Now you'll need to put files
on the req.body
before x-express-openapi-additional-middleware
, i.e. via args.consumesMiddleware
. Once that's done, you'll get the same error I've been getting all along:
errors: [
{
"path": "files[0]",
"errorCode": "type.openapi.validation",
"message": "should be string",
"location": "body"
}
]
After spending some time on this, this would require quite a few changes to get working, and I'm not sure it's in scope of this library. Other libraries handle this issue by skipping validation on format: binary
, but not sure that's any better. Ultimately, the problem can be solved by setting body[files][0]
to an arbitrary (or empty) string, and getting the file via req.files
. Is it the prettiest solution? I don't think so. But it works...
@mendelk can you open a PR to update the applicable README?
@jsdevel Sure! But as I said, not sure my solution is Right Way™ to do this. If you still want me to, I'll be happy make it.
@mendelk i feel a clear example of how to get files working would be beneficial. it doesn't have to be branded as the right/only way to handle it, but A Way ™
i've got 2 tests failing on the master at the moment:
1) openapi-request-validator
should fail bad media type for request body:
TypeError: invalid media type
at Object.parse (node_modules/content-type/index.js:126:11)
at getSchemaForMediaType (index.ts:2:79)
at OpenAPIRequestValidator.validate (index.ts:1:28)
at Context.<anonymous> (test/data-driven.ts:22:31)
2) openapi-request-validator
should handle missing content type as missing body:
TypeError: argument string is required
at Object.parse (node_modules/content-type/index.js:108:11)
at getSchemaForMediaType (index.ts:2:79)
at OpenAPIRequestValidator.validate (index.ts:1:28)
at Context.<anonymous> (test/data-driven.ts:22:31)
If i understand this correctly, after multer parses the request, it attaches files
on req
which we should put on the body
before validation runs:
req.body.files = [{
"fieldname": "files",
"originalname": "stock.jpeg",
"encoding": "7bit",
"mimetype": "image/jpeg",
"destination": "/tmp/uploads",
"filename": "8456802f71781abe34a1da8cca4f8244",
"path": "/tmp/uploads/8456802f71781abe34a1da8cca4f8244",
"size": 100367
}];
The apidoc says that files
array should have 2 props called type
and format
:
files: {
type: 'array',
items: { type: 'string', format: 'binary' }
}
but there're no such properties, so that's why it fails?
Btw the proposed solution:
multer.any()(req, res, (err) => {
if (err) {
return next(err);
}
// Handle both single and multiple files
const filesMap = req.files.reduce(
(acc, f) =>
Object.assign(acc, {
[f.fieldname]: (acc[f.fieldname] || []).concat(f)
}),
{}
);
Object.keys(filesMap).forEach((fieldname) => {
const files = filesMap[fieldname];
req.body[fieldname] = files.length > 1 ? files.map(() => '') : ''; // BUG
});
return next();
});
Doesn't validate when schema defines files
as an array, but only one file is uploaded. files.length
is 1 then so req.body.files == ''
, but the schema says it should be an array.
nah, seems like this isn't the case, files[0] should be a string
makes no sense. i've opened a PR that nicely reproduces this in request validator https://github.com/kogosoftwarellc/open-api/pull/329
here's how the generated validator looks like:
function validateBody(data, dataPath, parentData, parentDataProperty, rootData) {
'use strict';
var vErrors = null;
var errors = 0;
if ((data && typeof data === 'object' && !Array.isArray(data))) {
var errs__0 = errors;
var valid1 = true;
var data1 = data.body;
if (data1 !== undefined) {
var errs_1 = errors;
if ((data1 && typeof data1 === 'object' && !Array.isArray(data1))) {
var errs__1 = errors;
var valid2 = true;
var data2 = data1.files;
if (data2 === undefined) {
valid2 = false;
var err = {
keyword: 'required',
dataPath: (dataPath || '') + '.body',
schemaPath: '#/properties/body/required',
params: { missingProperty: 'files' },
message: 'should have required property \'files\''
};
if (vErrors === null) vErrors = [err]; else vErrors.push(err);
errors++;
} else {
var errs_2 = errors;
if (Array.isArray(data2)) {
var errs__2 = errors;
var valid2;
for (var i2 = 0; i2 < data2.length; i2++) {
var errs_3 = errors;
if (!(typeof data2[i2] === 'string')) { // FAILS HERE: data2[i2] (body.files) is an array
var err = {
keyword: 'type',
dataPath: (dataPath || '') + '.body.files[' + i2 + ']',
schemaPath: '#/properties/body/properties/files/items/type',
params: { type: 'string' },
message: 'should be string'
};
if (vErrors === null) vErrors = [err]; else vErrors.push(err);
errors++;
}
var valid3 = errors === errs_3;
}
} else {
var err = {
keyword: 'type',
dataPath: (dataPath || '') + '.body.files',
schemaPath: '#/properties/body/properties/files/type',
params: { type: 'array' },
message: 'should be array'
};
if (vErrors === null) vErrors = [err]; else vErrors.push(err);
errors++;
}
var valid2 = errors === errs_2;
}
} else {
var err = {
keyword: 'type',
dataPath: (dataPath || '') + '.body',
schemaPath: '#/properties/body/type',
params: { type: 'object' },
message: 'should be object'
};
if (vErrors === null) vErrors = [err]; else vErrors.push(err);
errors++;
}
var valid1 = errors === errs_1;
}
}
validate.errors = vErrors;
return errors === 0;
}
const body = {
'files': [
{
'fieldname': 'files',
'originalname': 'stock.jpeg',
'encoding': '7bit',
'mimetype': 'image/jpeg',
'destination': '/tmp/uploads',
'filename': '8456802f71781abe34a1da8cca4f8244',
'path': '/tmp/uploads/8456802f71781abe34a1da8cca4f8244',
'size': 100367
}
]
};
validateBody({body}); // fails with 'body.files[0] (type) should be a string
we should create a new issue for this
hm, could this be a case of a badly defined apidoc https://swagger.io/docs/specification/data-models/data-types/#array ?
type
and items are reserved words, so seems like type='string'
is being interpreted as "files array should have objects of type string (in binary format)`? Kind of makes sense, files should be an array of files represented as strings (in binary format), not some objects.
when multipart schema is defined like this then it passes validation (w/ required
):
'multipart/form-data': {
schema: {
type: 'object',
required: ['files'],
properties: {
files: {
type: 'array',
items: {
type: 'object',
properties: {
fieldname: { type: 'string' },
encoding: { type: 'string' },
mimetype: { type: 'string' },
size: { type: 'integer' },
destination: { type: 'string' },
filename: { type: 'string' },
path: { type: 'string' }
}
}
}
}
}
}
this makes more sense to me, to properly define the schema according to what the backend expects, using multer in this case, then to do some hacks with custom middleware to actually disable validation. so either don't use required and disable validation, or properly define request body. What do you guys say?
@mbsimonovic
The apidoc says that
files
array should have 2 props calledtype
andformat
Not quite accurate. The items in the files
array have no props at all, because they're strings. The format
has no effect at all (unless you define a customFormat
).
Doesn't validate when schema defines
files
as an array, but only one file is uploaded.files.length
is 1 then soreq.body.files == ''
, but the schema says it should be an array
See my extended example linked in README for code that handles both single files and arrays.
this makes more sense to me, to properly define the schema according to what the backend expects, using multer in this case
The problem is that that the spec expects type: string, format: binary
for strings (as do all the tools that tie in, such as Swagger UI etc). The fact that Multer is being used is not connected to the spec at all, so I'm not sure it makes sense to change the validator to expect a Multer object when the spec calls for a type: string, format: binary
.
so either don't use required and disable validation
I don't think it makes sense to change your spec to satisfy your validator. (In fact , you don't always have that option, when the spec isn't defined by you.)
Let's step back. How am I supposed to write the spec for multiple files upload? I thought this is correct, no:
requestBody: {
description: 'a multipart',
content: {
'multipart/form-data': {
schema: {
type: 'object',
// required: ['files'], // fails #279
properties: {
files: {
type: 'array',
items: { type: 'string', format: 'binary' }
}
}
}
}
}
}
That expects req.body.files
to be an array of strings (in binary). In this case it's perfectly valid to upload a single file, the array files will have one element. In this case your example fails because req.body.files will be an empty string, instead of an array having one empty string.
Correct. My example was for a single file.
'multipart/form-data': {
schema: {
type: 'object',
required: ['file'],
properties: {
file: {
type: 'string',
format: 'binary'
}
}
}
}
To be 100% accurate, you can't do this without inspecting the apiDoc
for this route because you don't know if it should be an array or a single file. In my example, I make the assumption that if there are multiple files with the same fieldname
, we have an array. But that can fail in the real world, if for example your spec specifies an array but a single file happened to be passed...
just to double check, with consumesMiddleware
defined, does it execute on all requests, not just some custom paths that should handle uploads? This is a security problem:
WARNING: Make sure that you always handle the files that a user uploads. Never add multer as a global middleware since a malicious user could upload files to a route that you didn't anticipate.
Only use this function on routes where you are handling the uploaded files.
I think it does execute on all routes, but not quite sure what the security problem is. Can you explain?
i'm not sure either, but sounds like something like this:
can't think of an exploit from the top of my head, but the fact that it runs where it shouldn't is a concern :)
Yeah I'm not seeing the problem but then again I'm no security expert 😀.
I think a strong argument can be made that Multer should be an official dependency of express-openapi
and that the library would handle all multipart file uploads, only on the specified routes based on the spec etc.
What are your thoughts @jsdevel?
@mendelk there's pros and cons: Pros:
i'm so happy i don't have to make this decision :)
I agree with your list, however, considering that Multer is the official way to handle file uploads (according to express docs), maybe the con isn't too bad!
i'm getting an error trying to handle file uploads with v3.6.0. The spec is:
express-openapi is initialised as suggested:
Hitting the app with curl or swagger-ui:
curl "http://localhost:3000/upload/" -H "accept: */*" -H "Content-Type: multipart/form-data" -F docs=@package.json
fails with the following error (the endpoint never gets called):The same happens with superagent/supertest and postman. Any hints?