jaydenseric / graphql-multipart-request-spec

A spec for GraphQL multipart form requests (file uploads).
994 stars 54 forks source link

Reasons for multipart/form-data in place of multipart/related #10

Closed heruan closed 6 years ago

heruan commented 6 years ago

What are the reasons for the spec to be based on multipart/form-data? In the context of GraphQL multipart/related could be a better fit (query + related files).

jaydenseric commented 6 years ago

Interesting. I'm not familiar with multipart/related, what is it and how do you see it being a better fit exactly? Is it universally supported by browser APIs and servers?

The primary use case for this spec is for clients to use the browser fetch API to upload files to a GraphQL server via mutations. The fetch API automatically encodes a multipart/form-data request with the right headers when the payload is a FormData instance. This is the standard web approach to file uploads, and there are a lot of off the shelf packages available for parsing it, etc.

jaydenseric commented 6 years ago

This question has been answered, but feel free to continue the conversation 🙏

heruan commented 5 years ago

Sorry for the delay! Yeah of course, multipart/form-data is widely supported by browsers natively; multipart/related is not. Still, it would fit better for the purpose:

A multipart/related is used to indicate that each message part is a component of an aggregate whole. It is for compound objects consisting of several inter-related components - proper display cannot be achieved by individually displaying the constituent parts. The message consists of a root part (by default, the first) which reference other parts inline, which may in turn reference other parts. Message parts are commonly referenced by the "Content-ID" part header. The syntax of a reference is unspecified and is instead dictated by the encoding or protocol used in the part.

One common usage of this subtype is to send a web page complete with images in a single message. The root part would contain the HTML document, and use image tags to reference images stored in the latter parts.

Defined in RFC 2387

(From: https://en.wikipedia.org/wiki/MIME#Related)

So I would imagine this multipart/related request for a GraphQL query:

Content-Type: multipart/related; boundary=example-2

--example-2
Content-Type: application/graphql; charset=utf-8

{
  "query": "mutation ($file: Upload!) { singleUpload(file: $file) { id } }",
  "variables": { "file": "cid:1234" }
}
--example-2
Content-Type: image/jpeg
Content-ID: <1234>

[encoded jpeg image]
--example-2--

This way there is no need of a map part, since multipart/related already provides a way (the Content-ID) to relate each part to the "root" part.

jaydenseric commented 5 years ago

multipart/related not working easily with browser APIs such as fetch is a dealbreaker. But even if it was easy, other than semantics, I don't see how it enables anything new from multipart/form-data. You still have parts that stream in sequentially.

The suggestion for removing the map field to use file part ID references in the operation instead equally applies to multipart/form-data, and equally doesn't make sense as far as I can tell. See https://github.com/jaydenseric/graphql-multipart-request-spec/issues/11#issuecomment-388230423.

There are 2 ways your suggestion could be implemented on the server: Easy and hard.

The easy way is to store the operation when it is streamed in - it must be the first part. Then for each file part encountered, for an unknown number of files, recurse the variables looking for it's Content-ID reference to populate the file in place. Problem with this is, the resolvers can't be run until the last file has been streamed in. The files have to be put somewhere in the meantime, so this would only really work for an API that stores temp files, and once uploading is complete, passes filesystem paths as the Upload scalar values to resolvers. You can't have an API like graphql-upload where the resolvers receive file upload streams. Also, multiple recursions instead of just 1 is undesirable.

The hard, but better performing way, is to store the operation when it is streamed in - again, it must be the first part. Then recurse the variables, looking for field string values that regex match some sort of convention for a Content-ID string, in order to construct a map, much like the one in the current spec. That is fragile, what if someone happens to have a string value beginning with cid:, or whatever convention is chosen? Hopefully the recursion to create the map happens fast enough that it is ready before the first file is received (just thinking out loud, maybe it's not an issue). The implementation would need to be able to account for the situation where the public might maliciously use cid: in the beginning of string values for mutations to do with making posts, changing their name, etc. Although, any implementation should be prepared for the event of malformed requests with files never arriving anyway.

I feel like there are also other factors I'm forgetting.