node-formidable / formidable

The most used, flexible, fast and streaming parser for multipart form data. Supports uploading to serverless environments, AWS S3, Azure, GCP or the filesystem. Used in production.
MIT License

Looking for guidance parsing one part of a multipart as raw data even though mime-type is tsv #570

Open trevtrich opened 4 years ago

trevtrich commented 4 years ago

We are currently running into an issue where we have a multipart form request in which one part has the standard JSON mime-type and the remaining parts are text/tab-separated-values but are encoded in a specific Windows encoding. If we send these through the multipart parser, it looks like they will be read as 'utf-8', which loses certain characters our customers need in their data. Is there any sort of handling for a case like this? Appreciate any help!
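To illustrate the kind of loss described above, here is a minimal sketch. It assumes the encoding is windows-1252 (the thread does not name the exact one) and a Node.js build with full ICU, so that `TextDecoder` accepts that label. It does not involve formidable at all; it only shows what happens when such bytes are decoded as UTF-8.

```javascript
// "café" encoded as windows-1252 (assumed encoding): 0xE9 is "é" there,
// but a lone 0xE9 byte is not a valid UTF-8 sequence.
const bytes = Buffer.from([0x63, 0x61, 0x66, 0xe9]);

// Decoding as UTF-8 replaces the invalid byte with U+FFFD,
// so the original character cannot be recovered afterwards.
const asUtf8 = bytes.toString('utf8');

// Decoding with the encoding the bytes were written in keeps the character.
const asWindows1252 = new TextDecoder('windows-1252').decode(bytes);

console.log(JSON.stringify(asUtf8));
console.log(JSON.stringify(asWindows1252));
```

Once the replacement character has been substituted, re-encoding the string does not give the original bytes back, which is why the raw bytes need to reach the application untouched.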

trevtrich commented 4 years ago

And of course after digging for some time and finding nothing, I file an issue then shortly thereafter run into https://github.com/node-formidable/node-formidable/issues/127. It seems like this should solve our issue. If that's correct, we can go ahead and close this.

trevtrich commented 4 years ago

In doing a little more research here, it looks like the use of Content-Transfer-Encoding, which the work referenced in the issue above relies on, is now deprecated: https://tools.ietf.org/id/draft-ietf-appsawg-multipart-form-data-04.html#rfc.section.5.8. Any suggestion for how to handle this sort of thing moving forward?

tunnckoCore commented 4 years ago

Hi there! I'm not very familiar with all that. It may be deprecated, but to verify, is it working with the current master (try installing formidable@canary)?

If it's working, great! :tada: But I don't have any ideas for how to handle it in the future when it's removed.

maybe /cc @GrosSacASac

tunnckoCore commented 4 years ago

I found a very recent question on SO: https://stackoverflow.com/questions/7285372/is-content-transfer-encoding-an-http-header

Going by that, Transfer-Encoding and Content-Encoding should be used for such things? So... we probably don't support it currently.

trevtrich commented 4 years ago

thanks for the input! i was admittedly only at the research stage trying to verify if this would work for our problem, so haven't integrated into our system quite yet.

as i dig more i'm a little disappointed in the browser support for even setting single form "part" header properties for input type=file, so that is still going to be an issue for us even once we do hash through this decision from the parsing/server side. either way, i wanted to make sure i wasn't crazy in thinking the specs for this are relatively grey as of right now. if i come across any good information on the topic i'll be sure to post back here. thanks again for the input! you're free to close for now if you'd prefer until we learn of a "blessed" way to handle this.

if anyone comes across this in the future, right now our solution is to modify the input form on submission and override the mime-type of these files as octet-stream so it won't be parsed as utf-8 server-side. best we've come up with thus far.
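for anyone wanting to try the same thing, here's a rough sketch of the idea, not our exact code. `withOctetStreamFiles` is a made-up helper name. it copies a FormData and re-wraps every file part in a Blob typed application/octet-stream, leaving the bytes and file names alone. it assumes an environment with the standard `FormData` and `Blob` globals (browsers, or a recent Node.js).

```javascript
// Hypothetical helper: returns a copy of `formData` in which every file part
// is re-labelled as application/octet-stream. The file bytes are not changed.
function withOctetStreamFiles(formData) {
  const result = new FormData();
  for (const [name, value] of formData.entries()) {
    if (typeof value === 'string') {
      // Plain text fields are copied through as-is.
      result.append(name, value);
    } else {
      // File parts: same bytes and file name, different declared mime-type.
      const raw = new Blob([value], { type: 'application/octet-stream' });
      result.append(name, raw, value.name);
    }
  }
  return result;
}

// Usage sketch with made-up field names and data.
const original = new FormData();
original.append('meta', '{"customer":"example"}');
original.append(
  'data',
  new Blob(['a\tb\n'], { type: 'text/tab-separated-values' }),
  'data.tsv'
);

const patched = withOctetStreamFiles(original);
console.log(patched.get('data').type);
```

in the browser you'd build `original` from the form element in a submit handler and send `patched` with `fetch` instead of letting the form submit normally. the server still has to decode the bytes with the right encoding itself once it has them.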