movio / bramble

A federated GraphQL API gateway
https://movio.github.io/bramble/
MIT License
497 stars 55 forks source link

add multipart file upload #268

Open benzolium opened 1 month ago

benzolium commented 1 month ago

Resolves https://github.com/movio/bramble/issues/154

@pkqk this a draft. Please take a look. Any suggestions or remarks?

I'm currently testing it manually. Though I'm using almost the same approach for 6 month on production. 🙂

benzolium commented 1 month ago

There is a problem with logging. My code: https://github.com/movio/bramble/blob/3dc62ad5a8ef4af565af56459bce13ae3d943f36/instrumentation.go#L58 breaks tests in instrumentation_test.go: However, code from the head: https://github.com/movio/bramble/blob/949badb3574efe87f16a1bdde6a27a0136d1babd/instrumentation.go#L55 just logs the whole requests body with log level info. Which is kinda not ok in case of large files (it even caused me some OOMs).

For now I've just reverted my code with changes to log level, but it is an issue to address.

pkqk commented 1 month ago

Hi @benzolium,

This looks cool, I'll give it a test when I have a chance. Is it possible you could add an example server to the examples folder to show it in use?

benzolium commented 1 month ago

@pkqk sure. Done.

I've also added one more test case and updated docs with headers plugin config: to make file upload work we must pass Content-Type headers, so I've updated gateway.json config.

benzolium commented 1 month ago

There’s still a problem with logging the whole body. For large files it will be a big issue.

@pkqk i cannot just remove logging of body completely because there is some instrumentation logic relying on that (there are even tests for having body in logs). So what should I do? Remove body from logging by checking content type header?

pkqk commented 1 month ago

It's probably worth removing the request body from the logging by default, it was in there initially as it was useful during early development.

For now you could stop it logging the body on a multipart/form-data request with something like:

diff --git a/middleware.go b/middleware.go
index a370351..f74f874 100644
--- a/middleware.go
+++ b/middleware.go
@@ -104,15 +104,20 @@ func addRequestBody(e *event, r *http.Request, buf bytes.Buffer) {
    contentType := r.Header.Get("Content-Type")
    e.addField("request.content-type", contentType)

-   if r.Method != http.MethodHead &&
-       r.Method != http.MethodGet &&
-       contentType == "application/json" {
-       var payload interface{}
-       if err := json.Unmarshal(buf.Bytes(), &payload); err == nil {
-           e.addField("request.body", &payload)
-       } else {
+   if r.Method != http.MethodHead && r.Method != http.MethodGet {
+       switch contentType {
+       case "application/json":
+           var payload interface{}
+           if err := json.Unmarshal(buf.Bytes(), &payload); err == nil {
+               e.addField("request.body", &payload)
+           } else {
+               e.addField("request.body", buf.String())
+               e.addField("request.error", err)
+           }
+       case "multipart/form-data":
+           e.addField("request.body", fmt.Sprintf("%d bytes", len(buf.Bytes())))
+       default:
            e.addField("request.body", buf.String())
-           e.addField("request.error", err)
        }
    } else {
        e.addField("request.body", buf.String())
pkqk commented 1 month ago

@benzolium I tried testing the example service but I'm getting these errors

{
  "errors": [
    {
      "message": "unexpected response code: 422 Unprocessable Entity",
      "locations": [
        {
          "line": 2,
          "column": 2
        },
        {
          "line": 3,
          "column": 2
        }
      ],
      "extensions": {
        "selectionSet": "{ uploadGizmoFile(upload: $fileA) uploadGadgetFile(upload: {upload:$fileB}) }"
      }
    }
  ],
  "data": null
}

This is the curl command I'm using

curl --url http://localhost:8082/query \
  --form 'operations={"query":"mutation upload($fileA: Upload!, $fileB: Upload!) {\n\tuploadGizmoFile(upload: $fileA)\n\tuploadGadgetFile(upload: {\n\t\tupload: $fileB\n\t})\n}","variables":{"fileA":null,"fileB":null}}' \
  --form 'map={
 "a": ["variables.fileA"],
 "b": ["variables.fileB"]
}' \
  --form a=@a.txt \
  --form b=@b.txt
benzolium commented 1 month ago

@pkqk good catch. Fixed. I was populating files map incorrectly.

pkqk commented 1 month ago

@benzolium I had a bunch more suggestions that were a bit too big to do through the review system. If you'd like to look at my fork

Also looking through the spec examples, I noticed the current file map approach won't handle arrays of Files, i.e making a file map with variables.fileList.0, variables.fileList.1 etc. So you might need a case to handle []graphql.Upload and []*graphql.Upload in the loop.

benzolium commented 1 month ago

@pkqk thanks for suggestions! I'd definitely apply some of them.

However there are several things I'd like to discuss:

Should we open a PR to my fork and discuss suggestions there? Or maybe I should apply other suggestions and we could discuss remaining here?

As for array files, I'll look into that. Haven't ever used files array so overlooked.

pkqk commented 1 month ago

Hi @benzolium, good questions, performance wise, we are dealing with a lot of json parsing and serialisation which is going to be an order of magnitude slower than any of the checks we do here, so I wouldn't consider optimising these until it proves to be an issue.

In general I'd go for clarity and slightly less efficient code than optimising before it's necessary.

pkqk commented 1 month ago

@benzolium I can either PR into your branch, or you can cherry pick the ones that you agree with and discuss the other ones here.

benzolium commented 1 month ago

Hey @pkqk! Thanks for the answers and for the ping. You can surely PR into my branch.

Sorry for the delay. I’m a little but overwhelmed with work now.

pkqk commented 1 month ago

There's no rush @benzolium, looking forward to merging this in but I want us to get it polished and working well before we do.

benzolium commented 1 month ago

Hey @pkqk! I've merged your suggestions and added a case for list of files.