profusion / apollo-federation-file-upload

Add file upload support to Apollo Federated services.

Transfer-encoding chunked #17

Closed Tibiritabara closed 4 years ago

Tibiritabara commented 4 years ago

Dear team,

I'm currently working on the implementation of this service within my federation instance. My federated services are written in Python with Django, and I'm unable to process chunked transfer encoding due to well-known limitations in the underlying Python libraries. I was wondering if there is a way to avoid chunked requests.

Thank you so much for your help and support

cabelitos commented 4 years ago

Hello, thanks for the report. However, I have some doubts about what you said. Are you using this library as a file upload alternative? What kind of transfer encoding does your Python backend support, given that it does not support chunked encoding? Is this some well-known library?

If your services are in Python, how are you using this library?

Tibiritabara commented 4 years ago

I have no idea, to be honest. I have been going back and forth over the WSGI implementation as well as Django's WSGIRequest implementation. It seems that the lack of a Content-Length header makes the entire framework go nuts, and it won't recognize data sent in a request. I just tested the library manually on my end and found that this block of code:

resolvedFiles.forEach(
      ({ createReadStream, filename, mimetype: contentType }, i: number) => {
        form.append(i.toString(), createReadStream(), {
          contentType,
          filename,
          /*
              Set knownLength to NaN so node-fetch does not set the
              Content-Length header and properly sets the encoding
              to chunked.
              https://github.com/form-data/form-data/pull/397#issuecomment-471976669
          */
          // knownLength: Number.NaN,
        });
      },
    );

is the one causing the issues. If my understanding is correct, here we are appending the stream to the form. Now, is there a way to append the object, not the stream, here? ... I'm deeply sorry for any trouble caused, as I'm struggling a bit with TypeScript at the moment.

Thank you so much in advance
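
For what it's worth, form-data's append also accepts a Buffer, in which case it can compute the part's length itself and node-fetch will send a Content-Length header instead of using chunked encoding. A minimal sketch of buffering the upload first (the streamToBuffer and appendBuffered helpers and the UploadItem type below are only illustrations, not part of this library):

import { Readable } from 'stream';
import FormData from 'form-data';

// Illustrative shape of a resolved upload; not part of apollo-federation-file-upload.
type UploadItem = {
  createReadStream: () => Readable;
  filename: string;
  mimetype: string;
};

// Read the whole upload stream into memory so its length is known.
const streamToBuffer = async (stream: Readable): Promise<Buffer> => {
  const chunks: Buffer[] = [];
  for await (const chunk of stream) {
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
};

// Appending a Buffer instead of the stream lets form-data compute the part's
// length, so node-fetch can send Content-Length instead of chunked encoding.
const appendBuffered = async (form: FormData, file: UploadItem, i: number): Promise<void> => {
  const { createReadStream, filename, mimetype: contentType } = file;
  form.append(i.toString(), await streamToBuffer(createReadStream()), {
    contentType,
    filename,
  });
};

The trade-off is that every upload is held fully in memory on the gateway, which is exactly the out-of-memory concern raised below.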

barbieri commented 4 years ago

The way we do it is to "forward" on the fly, so we need chunked encoding: data comes in, we send it out. The other way would be to load it completely into memory and then serialize it as one batch, providing the Content-Length. However, this may cause the gateway to go out of memory.

@cabelitos that was the original way you did it, without createReadStream. Maybe you can isolate the forEach function and add some toggle (parameter, env var...) to use one or the other, so @Tibiritabara can try the other function. Something like:


const createFormChunkedForEachFn = (form: FormData) => (
  {
    createReadStream,
    filename,
    mimetype: contentType,
  }: { createReadStream: () => NodeJS.ReadableStream; filename: string; mimetype: string },
  i: number,
): void => {
  form.append(i.toString(), createReadStream(), {
    contentType,
    filename,
    /*
        Set knownLength to NaN so node-fetch does not set the
        Content-Length header and properly sets the encoding
        to chunked.
        https://github.com/form-data/form-data/pull/397#issuecomment-471976669
    */
    // knownLength: Number.NaN,
  });
};

...
resolvedFiles.forEach(isChunked ? createFormChunkedForEachFn(form) : createFormMemoryForEachFn(form));
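
For reference, a rough sketch of what the in-memory counterpart (createFormMemoryForEachFn, referenced above but not shown) could look like, reusing the UploadItem type and streamToBuffer helper sketched earlier; this is an illustration, not the library's actual code:

const createFormMemoryForEachFn = (form: FormData) => async (
  { createReadStream, filename, mimetype: contentType }: UploadItem,
  i: number,
): Promise<void> => {
  // Buffer the whole upload so form-data knows its length and node-fetch
  // sends a Content-Length header instead of using chunked transfer encoding.
  const buffer = await streamToBuffer(createReadStream());
  form.append(i.toString(), buffer, { contentType, filename });
};

Because buffering is asynchronous, a caller would need something like await Promise.all(resolvedFiles.map(createFormMemoryForEachFn(form))) rather than a plain forEach.
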
Tibiritabara commented 4 years ago

Thanks for your support, I'm still struggling to load that into memory. In the meantime, here is a node-fetch issue related to the Content-Length that causes issues with Python WSGI: https://github.com/node-fetch/node-fetch/issues/47

I will keep trying, and I'll also greatly appreciate any help on this.

cabelitos commented 4 years ago

@barbieri I can add this implementation and see if it helps @Tibiritabara. We could add an argument to the FileUploaderDataSource constructor which enables this kind of logic (it should be off by default). However, I think this kind of behavior should be avoided if possible. I'll add a big remark when implementing this, since it could be exploited to exhaust the gateway's memory.
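
For illustration only, such a constructor flag could look roughly like this, building on the chunked and in-memory form builders sketched above; the option name loadFilesInMemory, the addFilesToForm method, and the overall shape are assumptions, not the actual change made to the library:

// Hypothetical sketch; the real option and class internals may differ.
class FileUploaderDataSource {
  private readonly loadFilesInMemory: boolean;

  // Off by default: buffering uploads in memory could be exploited to
  // exhaust the gateway's memory, so chunked transfer stays the default.
  constructor({ loadFilesInMemory = false }: { loadFilesInMemory?: boolean } = {}) {
    this.loadFilesInMemory = loadFilesInMemory;
  }

  // Picks between the chunked and in-memory form builders sketched above.
  private async addFilesToForm(form: FormData, resolvedFiles: UploadItem[]): Promise<void> {
    if (this.loadFilesInMemory) {
      await Promise.all(resolvedFiles.map(createFormMemoryForEachFn(form)));
    } else {
      resolvedFiles.forEach(createFormChunkedForEachFn(form));
    }
  }
}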

Tibiritabara commented 4 years ago

That would be extremely helpful in my scenarios, thank you so much.

cabelitos commented 4 years ago

@Tibiritabara, please look at #18

Tibiritabara commented 4 years ago

Tested it and it works like a charm. I'm incredibly grateful for your help and support, you guys rock!

cabelitos commented 4 years ago

@Tibiritabara Please use version 2.2.0 which is available on NPM right now. Thank you!