octet-stream / form-data

Spec-compliant FormData implementation for Node.js
https://www.npmjs.com/package/formdata-node
MIT License

v3 dropped support for stream.Readable #32

Closed · falkenhawk closed this 3 years ago

falkenhawk commented 3 years ago

Hi, I'm having trouble using v3 with streams coming from https://github.com/jaydenseric/graphql-upload#type-fileupload . createReadStream() returns an instance of fs-capacitor's ReadStream, which extends stream.Readable.

My use case: I want to "proxy" files uploaded through GraphQL to an external "upload service". I construct a FormData and stream the uploaded files there with an axios request, ideally without buffering whole files in memory on my "graphql proxy" instance first.

  if (isGraphqlUpload(file)) {
    formData.append('file', file.createReadStream(), {
      filename: file.filename,
      type: file.mimetype,
    });
  }

With formdata-node v2 this works, but with v3 an error is thrown: Failed to execute 'append' on 'FormData': parameter 2 is not one of the following types: ReadStream | Buffer | File | Blob. It comes from the condition that resolves whether the value is "file-like" at https://github.com/octet-stream/form-data/blob/master/lib/FormData.ts#L189, and the isReadStream() part is obviously false because it only accepts instances of fs.ReadStream.

Could you perhaps give me a hint on how to proceed here? Do I really have to buffer the whole file locally inside my proxy script before pushing it further?

Or would it be possible for you to still support stream.Readable in v3? I know you deliberately removed the support to comply with browser standards, but you are still allowing use of fs.ReadStream, so maybe you could consider opening it back up to support the stream.Readable interface again... 🙈

Or should I just stick to v2 for now? From what I've just noticed while looking at the code, I guess the files were fully buffered locally in v2 as well (while being converted to instances of File) before being sent? (I thought they were just streamed, but perhaps that's not possible at all? 🤔 Or maybe they were and I'm just reading the code wrong.)

octet-stream commented 3 years ago

Hi! I'm glad you asked this. Well, for now you'll need to stick with v2 until this issue is resolved :D

While I allow fs.ReadStream, it typically comes from fs.createReadStream. When you pass the result of that function to the .set() or .append() methods, they don't actually use the stream as the field's value; instead they create a File using the ReadStream#path property. This File only contains a reference to the file on disk (I use this function to create such references).

I think the solution would be some sort of a FileStream class which implements the following interface:

interface FileStream {
  type?: string
  size: number
  name: string
  stream(): Readable | ReadableStream
}

This should be enough to meet FormData's expectations so that it can treat the field as if it were a File. Note that the size and name fields are required by the spec and FormData expects them to be present (size is necessary if you want to use FormData#getComputedLength(), or if your http client supports spec-compatible form-data implementations, because they also need the size for the Content-Length header). Does fs-capacitor's ReadStream have a size field?
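
For illustration, if the byte size is known up front, a minimal wrapper satisfying that interface could look like this (a sketch; the class and property names are placeholders, not part of the library):

// A minimal sketch of a file-like wrapper, assuming the byte size is known ahead of time.
class SizedFileStream {
  constructor(source, {name, type = "", size}) {
    this.name = name       // required: used as the part's filename
    this.type = type       // optional MIME type
    this.size = size       // needed for Content-Length / FormData#getComputedLength()
    this._source = source  // any Readable (or WHATWG ReadableStream)
  }

  stream() {
    return this._source
  }

  // As it turns out further down the thread, a Symbol.toStringTag getter
  // returning "File" is also needed for the file-like check to pass.
  get [Symbol.toStringTag]() {
    return "File"
  }
}

// Usage sketch: fd is a formdata-node FormData instance, readable is any Readable
// fd.append("file", new SizedFileStream(readable, {name: "upload.bin", size: 1024}))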

octet-stream commented 3 years ago

Or should I just stick to v2 for now? From what I've just noticed while looking at the code, I guess the files were fully buffered locally in v2 as well (while being converted to instances of File) before being sent? (I thought they were just streamed, but perhaps that's not possible at all? 🤔 Or maybe they were and I'm just reading the code wrong.)

Not sure if I completely understand you, but in v2 ReadStream and Readable streams were always returned as-is, which means FormData won't touch their content until you start reading from FormData#stream. So as long as a given stream doesn't buffer anything before you start reading it, nothing from its content should be in memory.

octet-stream commented 3 years ago

What I'm trying to say is that each FormData instance only keeps a reference to a field's value in memory (not the value itself) when you .set() or .append() a File (whether you create it from createReadStream() or fileFromPathSync()), ReadStream, ReadableStream or Readable. It has worked that way from the beginning.

octet-stream commented 3 years ago

I haven't tried it, but as far as I can tell, this may solve your problem:

class FileStream {
  constructor(stream) {
    this.type = stream.mimetype   // MIME type reported by graphql-upload
    this.name = stream.filename   // original filename of the upload
    this._ref = stream            // keep a reference to the upload object itself
  }

  stream() {
    return this._ref.createReadStream()
  }
}

const fd = new FormData()

// Create a new FileStream with the fs-capacitor stream
fd.append("file", new FileStream(stream))
octet-stream commented 3 years ago

Note that you'll need to tell your http client to consume the request body from fd.stream and set the headers manually, because if you use it with a client that supports spec-compatible form-data implementations (such as node-fetch starting from v3), things may break when it tries to calculate a value for the Content-Length header, since FileStream does not have a size field.
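
With an axios-style client that manual wiring would look roughly like this (a sketch; client and url are placeholders):

// Stream the multipart body and pass the boundary headers yourself
const response = await client.post(url, fd.stream, {headers: fd.headers})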

octet-stream commented 3 years ago

I noticed that FileStream also needs a Symbol.toStringTag getter to meet fetch-blob's requirements in its Symbol.hasInstance method. So, this should work:

class FileStream {
  constructor(stream) {
    this.type = stream.mimetype
    this.name = stream.filename
    this._ref = stream
  }

  stream() {
    return this._ref.createReadStream()
  }

  get [Symbol.toStringTag]() {
    return "FileStream"
  }
}

See: https://github.com/node-fetch/fetch-blob/blob/3b12a39ea222ff43637a488af560836088fa5753/index.js#L161-L169

falkenhawk commented 3 years ago

Wow, thank you for your amazing support! You are my hero. 🍻

I am still a noob when it comes to streams. Thank you for clearing up my doubts about the memory usage.

Does fs-capacitor's ReadStream have a size field?

I am afraid it does not, and neither does graphql-upload - I can't see that it provides any way to retrieve the size of files uploaded via GraphQL, so this may pose a problem 🤔

I will try your FileStream implementation and pass the stream and headers to the axios request as you suggested. Thanks again!

octet-stream commented 3 years ago

Does it help?

falkenhawk commented 3 years ago

@octet-stream I suppose the FileStream class was meant to mimic an instance of File from your module, right? To make the isFile() check pass. I had to change return 'FileStream' to return 'File' inside Symbol.toStringTag to make it pass.

import { Readable } from 'stream';

class FileStream {
  constructor(private readableStream: Readable) {}

  stream() {
    return this.readableStream;
  }

  get [Symbol.toStringTag]() {
    return 'File';
  }
}

const formData = new FormData();

// `file` is of type { FileUpload } from 'graphql-upload'
formData.append(field, new FileStream(file.createReadStream()), {
  filename: file.filename,
  type: file.mimetype,
});

// send formdata: `client` is of type { AxiosInstance } from 'axios'
client.post(url, formData.stream, { headers: formData.headers })
octet-stream commented 3 years ago

I suppose class FileStream was meant to mimic instance of File from your module, right?

Right.

I had to change return 'FileStream' to return 'File' inside Symbol.toStringTag to make it pass.

Oh, sure. I thought they had a regular expression without ^ and $ in it (/(Blob|File)/, not /^(Blob|File)$/).

jimmywarting commented 3 years ago

Just want to swing in and say that fetch-blob is about to drop node streams.

In the upcoming fetch-blob@3 it will be enough to return an async iterable that can yield Uint8Arrays from calling stream(). It shouldn't affect you much, since streams are already async iterable.

octet-stream commented 3 years ago

return an async iterable that can yield Uint8Arrays

Doesn't that mean I'll have to convert it back to Buffer, since formdata-node's internal stream uses Buffer? Or will the Readable stream handle that for me?

jimmywarting commented 3 years ago

Whether you yield a Buffer or a Uint8Array doesn't matter. Buffer extends Uint8Array, so you shouldn't have to worry much about it. You can either return a Node stream or a WHATWG stream whose Symbol.asyncIterator yields Uint8Arrays, or you could have a very simple generator function that yields Uint8Arrays.
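
For example, once that change lands, a field value whose stream() is a plain async generator should be enough (a sketch; the chunking and values are arbitrary):

// Sketch of a file-like object whose stream() is a simple async generator yielding Uint8Arrays
const fileLike = {
  name: "hello.txt",
  size: 11,               // total byte length of the chunks below
  type: "text/plain",
  async *stream() {
    yield Buffer.from("hello ")                      // Buffer is a Uint8Array subclass
    yield new Uint8Array([119, 111, 114, 108, 100])  // "world"
  },
  [Symbol.toStringTag]: "File"
}

// fd.append("file", fileLike)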

octet-stream commented 3 years ago

Figured out that I'll need to make some small changes to adopt the fetch-blob@3 release, since I cast everything to a string except for Buffer here: https://github.com/octet-stream/form-data/blob/ac03591f92a3397265ca5b757ac0483f53ba59f3/lib/FormData.ts#L467

octet-stream commented 3 years ago

you are still allowing use of fs.ReadStream

I've decided to deprecate it as well, in favour of fileFromPathSync and the upcoming fileFromPath :D ReadStream as a field's value will be removed in 4.x. Also, I will definitely drop CJS in the next major release.
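
The planned fileFromPath flow would look roughly like this (a sketch of the upcoming 4.x-style API; exact import paths may differ, so check the README of the version you're on):

import {FormData} from "formdata-node"
import {fileFromPath} from "formdata-node/file-from-path"

const fd = new FormData()

// fileFromPath builds a File that references the file on disk instead of reading it eagerly
fd.set("document", await fileFromPath("./document.docx"))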

octet-stream commented 3 years ago

I'm gonna close this since there's been no response. Hope my recommendation solves your problem. The issue will remain pinned, so if someone runs into the same problem, they'll find the solution here.

gregory commented 3 years ago

@octet-stream I used to pass a https://github.com/sindresorhus/got#streams stream. With your new implementation this is now impossible, as you've made the assumption that people will always pass a file. I'm now stuck on an earlier version of your lib :( Would it be easier to keep this lib backward compatible, or to maintain patches on all versions? Or should I just fork it, if this breaking change was intentional?

jimmywarting commented 3 years ago

@gregory As has been mentioned earlier, you can always create a file-like object that returns the stream.

But you also have to know how large the stream is. It works the same way with my formdata-polyfill:

const formData = new FormData()

formData.append('upload', {
  name: 'hello.txt',
  size: 3,
  type: '',
  stream() { return stream },
  [Symbol.toStringTag]: 'File'
})
octet-stream commented 3 years ago

But you also have to know how large the stream is

You don't have to, as long as you use FormData#stream + FormData#headers to send the content.

@gregory This change was made for compatibility with browser implementations. The FormData append and set methods can only take File and Blob objects; everything else will be converted to a string. As I mentioned above, you can make a small wrapper around fs-capacitor streams to bypass this restriction.
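
Putting the pieces from this thread together for an arbitrary Readable (e.g. a got stream), a minimal sketch might look like this (source, client and url are placeholders):

const fd = new FormData()

// File-like wrapper around any Readable; no size is provided,
// so the body has to be streamed instead of relying on Content-Length
fd.append("upload", {
  name: "download.bin",
  type: "application/octet-stream",
  stream() { return source },
  [Symbol.toStringTag]: "File"
})

// Consume FormData#stream and pass the boundary headers manually
await client.post(url, fd.stream, {headers: fd.headers})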