graphql / graphql-over-http

Working draft of "GraphQL over HTTP" specification
https://graphql.github.io/graphql-over-http
MIT License

[2023-11] Add notes about security to GraphQL-over-HTTP spec #280

Open benjie opened 12 months ago

benjie commented 12 months ago

@martinbonnin and @glasser at Apollo were discussing CSRF, timing attacks, etc. Benjie feels that general HTTP concerns (security, rate limiting, cookies, etc.) are outside the scope of the GraphQL-over-HTTP spec, but Lee suggests that, in the "art rather than science" vein, we should have a non-normative section on how to think about security. That section would hand off to established best-practice guidance on HTTP/internet security, but we should also add GraphQL-specific notes, especially of the form "this is secure because we omitted it".

(NOTE: @leebyron said "non-conformance" and "non-compliance", but I believe he meant "non-normative". Lee, please correct me if I misunderstood you.)


Note: Action Item issues are reviewed and closed during Working Group meetings.

benjie commented 12 months ago

@martinbonnin and/or @glasser, would you care to elaborate on your concerns and/or submit non-normative notes to the GraphQL-over-HTTP spec regarding this? I'm happy to do the editorial work on them if you only have time for rough notes; I just want to ensure I'm capturing the important parts.

martinbonnin commented 12 months ago

Thanks for following up on this!

I'll defer to @glasser for the details, but my high-level understanding is that some conditions make GraphQL requests more prone to CSRF issues like the ones described in this blog post:

  1. POSTs with a multipart/form-data content-type (which can be accepted by some middleware, such as file uploads) could modify state while bypassing CORS, because multipart/form-data doesn't require a preflight request.
  2. GETs could be used for timing attacks. This is effectively the same concern as with any REST API, but GraphQL makes such attacks a bit easier because the dynamic nature of the query makes it harder for the backend to reply in constant time.

For 1., might be worth requiring a content-type around here? For 2. maybe a "security" section towards the end of the document?

benjie commented 12 months ago

Moving this to the GraphQL-over-HTTP WG

Shane32 commented 12 months ago

I'm all for adding important security notes or references when appropriate to the spec as a note.

As it particularly relates to multipart/form-data, we should be sure not to strictly prevent it. As others have noted, bypassing CORS isn't necessarily a security risk, and changes could be made to require a preflight request and CORS validation while still using multipart/form-data.

benjie commented 9 months ago

@glasser Any interest in raising a PR for this?

glasser commented 8 months ago

Sorry for taking so long to get back to this. Yes, I think it would be reasonable for me to write a PR for this.

The key bit to me is that CSRF has become a much simpler problem to avoid in the modern REST/JSON API world, because API endpoints that require you to pass Content-Type: application/json for any request with side effects are automatically protected against CSRF. In fact, developers these days often don't even have to learn about CSRF as long as all the APIs they create have that property.

Because of this, my first suggestion is that we change the following sentence in the POST section:

If the client does not supply a Content-Type header with a POST request, the server SHOULD reject the request using the appropriate 4xx status code.

from a SHOULD to a MUST. Do we know of any real GraphQL servers that don't already obey this?

Secondly, I would add a non-normative section noting that GraphQL servers are generally immune to CSRF attacks, because side effects should only be accessible through mutations, mutations can only be executed by POSTs, and POST requests MUST have an appropriate Content-Type header. However, if a GraphQL server chooses to allow POSTs with Content-Type headers that are not officially recognized, and specifically the Content-Type of multipart/form-data (or text/plain or application/x-www-form-urlencoded), then the immunity to CSRF attacks vanishes. If immunity from CSRF attacks matters (e.g., you have a web client that uses cookies or Basic Auth, or your web server is not accessible on the public internet and gives extra trust to any client that can contact it), you should avoid allowing those Content-Types or provide an alternative CSRF prevention method, such as requiring all such requests to have a require-preflight header.
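To make the Content-Type reasoning above concrete, here is a minimal Python sketch that classifies whether a cross-origin browser POST with a given Content-Type would need a CORS preflight. The function names are hypothetical, and the parsing is deliberately simplified (it only splits off parameters such as charset or boundary and does not validate the media type).

```python
# The three Content-Type essences named in the comment above: a browser
# can send these cross-origin without a CORS preflight.
NO_PREFLIGHT_ESSENCES = frozenset({
    "text/plain",
    "multipart/form-data",
    "application/x-www-form-urlencoded",
})


def content_type_essence(header_value):
    """Return the lowercased 'type/subtype' part of a Content-Type value.

    Simplified for illustration: parameters are dropped and the media
    type syntax is not validated.
    """
    if header_value is None:
        return None
    essence = header_value.split(";", 1)[0].strip().lower()
    return essence or None


def post_forces_preflight(content_type):
    """True if this Content-Type on a cross-origin browser POST would
    require a CORS preflight (the property that gives CSRF protection)."""
    essence = content_type_essence(content_type)
    # A missing Content-Type cannot force a preflight by itself.
    if essence is None:
        return False
    return essence not in NO_PREFLIGHT_ESSENCES
```

Under this sketch, `application/json` forces a preflight while `multipart/form-data; boundary=...` does not, which is why accepting the latter removes the protection.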

Thirdly, I would consider noting that even allowing read-only GET requests from web browser clients without CORS preflighting can be problematic, due to timing attacks. It may be reasonable for most REST GET APIs to be accessible without a CORS preflight, because the only information code from a non-allowed origin can learn is how long the API took, and it's often not too hard to make the response time of a normal REST GET API relatively constant. But GraphQL queries are very flexible, and it's quite possible to design a query that can use timing to determine whether a given field returns null or not, by nesting under that field a complex set of recursive field selections that are only evaluated if the field returns non-null. So a non-normative suggestion that servers that use cookies/Basic Auth or are inaccessible from the public internet consider also requiring that GET operations contain a content-type header or another specific header such as require-preflight would be reasonable.
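The timing concern can be illustrated with a toy model. The Python sketch below is not a real GraphQL executor: it represents a selection set as nested dictionaries and counts how many fields get resolved, as a rough stand-in for response time. The field names `secretProfile`, `bestFriend`, and `name` are hypothetical.

```python
def nested_selection(field, depth):
    """Build a selection that nests `field` inside itself `depth` times."""
    selection = {"name": {}}  # innermost leaf field
    for _ in range(depth):
        selection = {field: selection}
    return selection


def count_resolved_fields(selection, value):
    """Count the fields a toy executor would resolve.

    selection: dict mapping field name -> nested selection ({} for leaves)
    value: dict holding the data of the current object
    """
    resolved = 0
    for field, sub_selection in selection.items():
        resolved += 1
        child = value.get(field)
        # Sub-selections are only evaluated when the field is non-null.
        if child is not None and sub_selection:
            resolved += count_resolved_fields(sub_selection, child)
    return resolved


# Self-referential data so the nested selection can keep recursing.
alice = {"name": "Alice"}
alice["bestFriend"] = alice

probe = {"secretProfile": nested_selection("bestFriend", 50)}

work_if_null = count_resolved_fields(probe, {"secretProfile": None})
work_if_present = count_resolved_fields(probe, {"secretProfile": alice})
```

In this model the same query resolves 1 field when `secretProfile` is null and 52 fields when it is not, so the amount of work alone reveals whether the field was null. A real query that fans out over lists at each level would widen that gap much further.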

I wish in retrospect that we had chosen graphql-require-preflight instead of apollo-require-preflight for the Apollo Router/Server CSRF prevention feature so it would be more reasonable to add as a non-normative suggestion here. The DGS framework allows clients to pass either graphql-require-preflight or apollo-require-preflight; perhaps the spec should suggest the former? I think it wouldn't be too hard to get the Apollo implementations to support both headers too.

Other existing implementations include the graphql-yoga implementation which uses x-graphql-yoga-csrf by default.

I see that drupal-graphql also supports apollo-require-preflight and x-graphql-yoga-csrf. And Pioneer supports apollo-require-preflight.

So concretely I think my changes would be the SHOULD to MUST change above, and a non-normative suggestion that servers should consider rejecting any request whose content-type is missing or has an essence of text/plain, multipart/form-data, or application/x-www-form-urlencoded, unless that request also contains one of a set of other header names, where that set should contain graphql-require-preflight.
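The proposed rule could be sketched in Python as follows. This is only an illustration under stated assumptions: the function name is hypothetical, headers are passed as a plain dictionary, the header set simply collects the names mentioned in this thread, and mere presence of one of those headers (with any value) is treated as sufficient.

```python
# Content-Type essences that a browser can send without a CORS preflight.
SIMPLE_ESSENCES = frozenset({
    "text/plain",
    "multipart/form-data",
    "application/x-www-form-urlencoded",
})

# Header names mentioned in this thread. Treating all of them as
# acceptable is an assumption made for this sketch.
PREFLIGHT_HEADER_NAMES = frozenset({
    "graphql-require-preflight",
    "apollo-require-preflight",
    "x-graphql-yoga-csrf",
})


def passes_csrf_check(headers, preflight_header_names=PREFLIGHT_HEADER_NAMES):
    """Return True if the request should be allowed under the proposed rule.

    headers: dict of request header name -> value (names in any case).
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    content_type = lowered.get("content-type", "")
    essence = content_type.split(";", 1)[0].strip().lower()
    # A Content-Type outside the simple set already forces a preflight.
    if essence and essence not in SIMPLE_ESSENCES:
        return True
    # Otherwise require one of the preflight-forcing custom headers.
    return any(name in lowered for name in preflight_header_names)
```

For example, a request with `Content-Type: multipart/form-data` alone is rejected, while the same request with an added `graphql-require-preflight` header is allowed.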

Should I do this as two separate PRs (separating the SHOULD/MUST from the rest) or just as one?

benjie commented 8 months ago

Awesome write-up; this all sounds great to me, except the change from SHOULD to MUST. I feel it's okay to say SHOULD with a non-normative note stating that not implementing the behavior would open you up to potential security issues if you are dealing with browser-based clients, and thus you should X, Y and Z. If you feel strongly that this isn't an appropriate way to deal with the issue, then please raise it as a separate PR to enable discussion. (As for clients that don't send Content-Type: application/json or similar: command-line clients are the most common, along with basic HTTP clients that people write manually. Changing to MUST would force servers to become incompatible with these clients, and a change like that should have to clear a very high bar.)

Standardizing graphql-require-preflight sounds like a great call since it's already used fairly widely (or has similar alternatives) - I suggest raising that as a PR on its own for discussion.

glasser commented 8 months ago

Are there examples of popular GraphQL servers that accept POSTs with JSON bodies with no content-type header? The whole JS body-parser-based ecosystem does not, for example. I have definitely had to type -H 'content-type: application/json' many times to curl in my day (though with newer curl you can just use --json instead of --data!)
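For hand-written clients, the equivalent of that curl flag is setting the header explicitly. A minimal Python sketch using only the standard library follows; the helper name and the example.com endpoint are hypothetical, and the request is only built, not sent.

```python
import json
import urllib.request


def build_graphql_post(url, query, variables=None):
    """Build (but do not send) a GraphQL POST with an explicit JSON Content-Type."""
    payload = {"query": query}
    if variables is not None:
        payload["variables"] = variables
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint, used only for illustration.
request = build_graphql_post("https://example.com/graphql", "{ __typename }")
```

Passing `request` to `urllib.request.urlopen` would send it with the JSON content type attached.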

benjie commented 5 months ago

"Are there examples of popular GraphQL servers that accept POSTs with JSON bodies with no content-type header?"

The GitHub API:

curl \
  -H 'Authorization: Bearer '$GITHUB_TOKEN \
  -X POST \
  https://api.github.com/graphql \
  --data-binary '{"query":"{ organization(login:\"graphile\") { repository(name:\"crystal\") { description } } }"}'

glasser commented 5 months ago

Hmm, but that API basically requires an Authorization header, right?

benjie commented 5 months ago

Without one it seems to have a rate limit of zero currently.