H4ad / serverless-adapter

Run REST APIs and other web applications using your existing Node.js application framework (NestJS, Express, Koa, tRPC, Fastify and many others), on top of AWS, Azure, Huawei and many other clouds.
https://serverless-adapter.viniciusl.com.br/
MIT License
134 stars 8 forks source link

Binary detection (AWS API Gateway) #258

Open Guichaguri opened 2 months ago

Guichaguri commented 2 months ago

Feature Request

Is your feature request related to a problem? Please describe.

Currently, you have to manually specify which content types are considered "binary" in order to use them with the API Gateway. Not specifying them causes the response to be converted into a UTF-8 string. This is responsible for a data loss that corrupts the file, and it's not clear to the user what happened.

Describe the solution you'd like

Considering that encoding text data into binary does not corrupts the data, but encoding binary data into UTF-8 does, I think the library should consider everything as binary by default, and have a list of text content types instead of binary ones.

Encoding text data as base64 is inefficient, but it still works. To work around the inefficiency, the library can have a list of common text mime types by default.

Describe alternatives you've considered

Adding to every common binary mime type the DEFAULT_BINARY_CONTENT_TYPES, although it does not solves the unclear indication of the cause, it softens the problem as it will be faced on fewer cases.

Are you willing to resolve this issue by submitting a Pull Request?

Yes, I have the time, and I know how to start.

H4ad commented 2 months ago

For a minor release, we can add every common binary mime type to DEFAULT_BINARY_CONTENT_TYPES.

Considering that encoding text data into binary does not corrupts the data, but encoding binary data into UTF-8 does, I think the library should consider everything as binary by default, and have a list of text content types instead of binary ones.

We can do this but it will require a major release since I will invert the way I check the binary.

Also, I think we can rewrite the internal code to handle response body as buffer instead of (buffer->string->buffer) every time, this will not be a breaking change (I think) and it will probably speedup this lib.

Well, feel free to open PRs for each suggestion, the first and the last one I can merge and release as minor, but the second one will require a major release, so I will delay it a little bit.