hasura / graphql-engine

Blazing fast, instant realtime GraphQL APIs on your DB with fine grained access control, also trigger webhooks on database events.
https://hasura.io
Apache License 2.0

Feature Request: Rate limiting and Scoring #2151

Open ozum opened 5 years ago

ozum commented 5 years ago

I would like to have rate limiting (number of requests per user per second) and scoring of query cost (score per user per minute, etc.).

The points mentioned in https://blog.apollographql.com/securing-your-graphql-api-from-malicious-queries-16130a324a6b seem nice to have in Hasura.

Kind Regards,

kriswep commented 5 years ago

I'd second this.

You could easily imagine malicious queries taking down APIs powered by Hasura. For example, running the following query against the todo learning API takes some time, and you could easily extend the nesting there:

{
  todos {
    user {
      todos {
        id
        user {
          id
          todos {
            id
          }
        }
      }
    }
  }
}

For the record, there was some discussion around this in #346, #989 and #1283 but they don't seem to have led anywhere.

coco98 commented 5 years ago

@kriswep Allowlists (#989) is ready for review and testing and should land soon. :) https://github.com/hasura/graphql-engine/pull/2075

The rest are fairly complicated and we're coming to it gradually!

kriswep commented 5 years ago

That's good to hear, I hadn't seen that before. But there are use cases which query whitelisting doesn't fulfill (having a public facing API with unknown clients). I hope other options like depth limiting and cost analysis won't be forgotten.

ozum commented 5 years ago

The rest are fairly complicated and we're coming to it gradually!

I agree. Maybe starting with depth limiting would be easier and would have a relatively large impact on stopping malicious queries.
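To make the idea concrete, here is a minimal sketch of depth limiting in Python. It is not how Hasura does anything; `query_depth` and `MAX_DEPTH` are names I made up for illustration. It only counts nested selection sets, skips braces inside argument lists, strings and comments, and does not follow fragments, so a real implementation should use a proper GraphQL parser.

```python
MAX_DEPTH = 4  # hypothetical limit, chosen only for this example


def query_depth(query: str) -> int:
    """Return the deepest nesting level of selection sets in a GraphQL query.

    Braces inside argument lists (input objects), string literals and
    comments are ignored. Fragment spreads are NOT expanded.
    """
    depth = max_depth = 0
    parens = 0  # > 0 while we are inside an argument list
    in_string = in_comment = escaped = False
    for ch in query:
        if in_comment:
            in_comment = ch != "\n"  # a comment runs to the end of the line
            continue
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch == "#":
            in_comment = True
        elif ch == "(":
            parens += 1
        elif ch == ")":
            parens -= 1
        elif parens == 0 and ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif parens == 0 and ch == "}":
            depth -= 1
    return max_depth


def too_deep(query: str, limit: int = MAX_DEPTH) -> bool:
    """True when the query nests deeper than the configured limit."""
    return query_depth(query) > limit
```

With this counting, the nested todos/user query posted earlier in the thread has depth 6 (the outermost braces count as level 1), so it would be rejected with a limit of 4.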

ptrobert commented 5 years ago

Wish list: For rate limiting, it would be good to have per-user limits per minute/hour/day/month. A list of blocked users has to be maintained, with the ability to unblock them. Support for remote schemas as well.

mfdeux commented 5 years ago

@ptrobert Short-term, most of those could be solved using the webhook auth method.

txchen commented 5 years ago

@mfdeux currently Hasura is not forwarding the GraphQL query name/query details. If I want to rate limit insertions into a certain collection/table, how can we do it with the webhook?

mfdeux commented 5 years ago

Ok, I understand. A short-term hack I’m using is running requests through a proxy and parsing the query (using JS or Go), and then implementing rate limiting based on that, but it's obviously not a long-term solution.
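Roughly, the proxy-side logic looks like the sketch below. It is written in Python here just to keep it short and is not my actual proxy code. `root_fields`, `SlidingWindowLimiter` and `allow_request` are illustrative names, the limits are arbitrary, and the field extraction is a crude tokenizer that does not handle fragments, so treat it as an assumption-laden sketch rather than a parser.

```python
import re
import time
from collections import defaultdict, deque

# Strings and comments are matched as single tokens so that braces or
# parentheses inside them are never counted.
_TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|#[^\n]*|[A-Za-z_]\w*|[{}():]')


def root_fields(query: str) -> list:
    """Names found at the top level of the operation, e.g. ['insert_todos'].

    Aliases are returned alongside the real field name; fragments are ignored.
    """
    fields, depth, parens = [], 0, 0
    for tok in _TOKEN.findall(query):
        if tok == "(":
            parens += 1
        elif tok == ")":
            parens -= 1
        elif parens:
            continue  # skip everything inside argument lists
        elif tok == "{":
            depth += 1
        elif tok == "}":
            depth -= 1
        elif depth == 1 and (tok[0].isalpha() or tok[0] == "_"):
            fields.append(tok)
    return fields


class SlidingWindowLimiter:
    """Allow at most `limit` hits per key within the last `window_seconds`."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit, self.window, self.clock = limit, window_seconds, clock
        self.hits = defaultdict(deque)

    def allow(self, key) -> bool:
        now = self.clock()
        hits = self.hits[key]
        while hits and now - hits[0] >= self.window:
            hits.popleft()  # drop hits that fell out of the window
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(limit=5, window_seconds=60)


def allow_request(user_id: str, query: str) -> bool:
    """Rate limit only insert mutations, per user and per root field."""
    inserts = [f for f in root_fields(query) if f.startswith("insert_")]
    return all(limiter.allow((user_id, f)) for f in inserts)
```

The proxy calls `allow_request` before forwarding to Hasura and returns an error to the client when it yields `False`. State lives in process memory, so this only works for a single proxy instance.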

txchen commented 5 years ago

Yes, that can be a solution, but not very easy to use :) If hasura can forward more context info to webhook, and extend the protocol, I think rate limiting / recaptcha can be done easily.

ptrobert commented 5 years ago

Opened a feature request for passing the Hasura query/mutation into the webhook auth POST body: https://github.com/hasura/graphql-engine/issues/2666

beepsoft commented 5 years ago

I found this, it may provide some good ideas for complexity analysis:

https://github.com/slicknode/graphql-query-complexity

ozum commented 5 years ago

I found this, it may provide some good ideas for complexity analysis:

https://github.com/slicknode/graphql-query-complexity

@coco98, what @beepsoft posted could help.

bitjson commented 4 years ago

Has anyone tried to pass through the Postgres EXPLAIN "cost" estimate for this?

Even if it's off by orders of magnitude, being able to configure a HASURA_GRAPHQL_COST_LIMIT=100000 cap could prevent the most pathological queries for my use case.
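To illustrate what I mean, here is a minimal Python sketch that reads the planner estimate out of `EXPLAIN (FORMAT JSON)` output and compares it to a cap. `HASURA_GRAPHQL_COST_LIMIT` is the hypothetical setting from my comment, not an existing Hasura option, the sample plan is made up, and how you get hold of the generated SQL is left out.

```python
import json
import os


def total_cost(explain_json: str) -> float:
    """Extract the estimated total cost from EXPLAIN (FORMAT JSON) output.

    Postgres returns a JSON array with one object whose "Plan" node
    carries the "Total Cost" estimate for the whole statement.
    """
    plans = json.loads(explain_json)
    return float(plans[0]["Plan"]["Total Cost"])


def exceeds_cost_limit(explain_json: str, limit=None) -> bool:
    """True when the planner estimate is above the configured cap."""
    if limit is None:
        # Hypothetical environment variable, see the note above.
        limit = float(os.environ.get("HASURA_GRAPHQL_COST_LIMIT", "100000"))
    return total_cost(explain_json) > limit


# Made-up sample in the shape Postgres produces.
sample = (
    '[{"Plan": {"Node Type": "Seq Scan", "Relation Name": "todos", '
    '"Startup Cost": 0.0, "Total Cost": 250000.5, '
    '"Plan Rows": 1000000, "Plan Width": 36}}]'
)
```

The cost is in the planner's arbitrary units, so the cap would have to be tuned per database, but rejecting anything above it would still catch the worst cases.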

tsaiDavid commented 4 years ago

This would be so valuable for us!

@dmi3y

beepsoft commented 4 years ago

@coco98 can you share your plans whether you will implement rate limiting in some form in the open source version of Hasura, or will it be a Hasura Pro option only (https://www.youtube.com/watch?v=JS6eMQ6H7eA)? Thanks!

ozum commented 4 years ago

@beepsoft, I didn't see that until I read your message.

Hasura is an open source engine that connects to your databases & microservices and instantly gives you a production-ready GraphQL API.

I hope Hasura is not becoming a "fake open source". Offering paid services and paid support is understandable. However, splitting software features into open source vs. pro does not seem "open source" to me. Worst of all is offering a fundamental feature such as rate limiting as a "pro" feature... especially considering that, without rate limiting, a public facing GraphQL server is very vulnerable to attacks.

This is just my 2¢ as a developer who publishes and contributes to lots of open source projects.

coco98 commented 4 years ago

Hi folks. We’ll always make sure you can run Hasura in production! Most of our time is and will be spent on open-source work.

Hasura Pro add-ons need Hasura (open-source) to have the required building blocks for rate-limiting (for eg: how allow lists work). We’ll be putting up the spec and raising the PR in a bit.

Also, I'm putting together a blogpost to answer how we will manage Hasura open-source soon too. :)

beepsoft commented 4 years ago

Hasura Pro add-ons need Hasura (open-source) to have the required building blocks for rate-limiting (for eg: how allow lists work). We’ll be putting up the spec and raising the PR in a bit.

@coco98 Does this mean that the open-source version of Hasura will definitely have some form of rate-limiting and this will be extended for the pro version?

ozum commented 4 years ago

Any progress on this?

tspike commented 4 years ago

Looks like they've added rate limiting as a Pro-only feature 😞

https://youtu.be/JS6eMQ6H7eA

ozum commented 4 years ago

From https://hasura.io/

Hasura is an open source engine that connects to your databases & microservices and instantly gives you a production-ready GraphQL API.

@coco98 said:

Most of our time is and will be spent on open-source work.

Also, I'm putting together a blogpost to answer how we will manage Hasura open-source soon too. ... We’ll be putting up the spec and raising the PR in a bit.

...and this:

https://youtu.be/JS6eMQ6H7eA

I agree with @tspike: 😞

beepsoft commented 4 years ago

Well, I got a thumb up from @coco98 for this question of mine, which means ... something.

@coco98 Does this mean that the open-source version of Hasura will definitely have some form of rate-limiting and this will be extended for the pro version?

ozum commented 4 years ago

@beepsoft I hoped the same as you. However, this issue has been open for a year, and rate limiting seems to have already been available in the "Pro" 🙄 version for months.

Hasura Pro add-ons need Hasura (open-source) to have the required building blocks for rate-limiting (for eg: how allow lists work).

I'm not trying to be rude, but IMHO this sentence does not mean the open source version gets rate limiting. I think it means: we are using the open source version and all PRs as a base for our closed version.

beepsoft commented 4 years ago

@ozum I would really like this feature too, and although I would not be too happy if it was only available in the Pro version, I would much appreciate a clear statement from Tanmai or someone from Hasura regarding its availability.

So, @coco98, @marionschleifer will we have rate limiting in the OS version or will it be a Pro only feature?

And as always: thank you very much for your efforts!

pcmaffey commented 4 years ago

I believe this issue https://github.com/hasura/graphql-engine/issues/2269 is the only thing blocking the ability to set up rate limiting via authhook. If Hasura enabled the ability for an authhook to set and pass headers back to the client in the response, we could use session cookies to manage rate limiting manually.

ozum commented 4 years ago

@pcmaffey #2269 is also an important FR. On the other hand, rate limiting is not only about limiting the number of requests. A malicious user can DoS any public Hasura system with a single query that is deep enough. (The only exception is whitelisting queries, which I think is a nuclear option.)

schettino commented 4 years ago

Looking at the Pricing page, it suggests a lot of essential features are being cut from the OS version in favour of the "Pro" one - in particular, rate limiting, which is mandatory functionality for any production app. I feel that you're clearly moving towards paid software, and a trend out there is to have another service/product that complements the main one in different ways (infrastructure, support, analytics, e.g. Cypress, Apollo, Prisma). Does this make sense?

I'd love to contribute monetarily, but I don't like the idea of being forced to do it.

Sorry if this is a harsh comment and I might be wrong about some points. In that case, I'd appreciate more transparency on some of the product plans, if that's possible.

Thanks =]

ozum commented 4 years ago

I would even agree to be forced to pay if it did not label itself as "open source", because I would know from the beginning that I'm using proprietary software and that there is a monetary agreement between me and the developer. Otherwise, I could select an open source alternative.

In the case of open source, people:

  1. Trust the developer,
  2. Invest time to learn and implement the tool in their projects,
  3. Contribute their time and skills free of charge as Pull Requests. (You can check my GitHub profile to see my open source projects and contributions to others. I couldn't contribute to Hasura, because it is written in a language I'm not an expert in.)

From home page of Hasura.com

Hasura is an open source engine that connects to your databases & microservices and auto-generates a production-ready GraphQL backend.

From pricing page of Hasura as @schettino pointed:

SECURITY                                                    Hasura Core   Hasura Pro
Session based rate limiting to prevent abuse/DoS attacks                      ✔

Hasura claims its open source offering is production ready, but leaves it vulnerable to "abuse/DoS attacks" according to their own marketing page. (I also agree with them, it is vulnerable.)

mitar commented 4 years ago

I am not sure what is so surprising? Somebody has to pay the bills. This is "open core" business model and for example GitLab is doing pretty well with it. And also their repository is full of issues asking for some feature to be moved to the community edition. It is simply a hard task to navigate this. So please give Hasura a bit of slack here.

Nobody is forcing you to use it. If you do not like it, move along. Open source means that you have the freedom to use it, or change the code to suit your needs, not the freedom to require others to adapt the code for you.

Or have the people complaining here already spent hundreds of hours contributing to the open source Hasura codebase, and now they feel misused? I do not think so.

It is simply a reality that business models change and develop, and features come and go. This is even true for proprietary solutions. How many times does it happen that a startup fails and says "in 1 week we are erasing all data"? And you do not even have access to the code and cannot really do much.

So, let's move the discussion about their pricing and business models somewhere else please, and let's keep this issue about how to get rate limiting (or hooks for rate limiting) to the open source version of Hasura. Nobody gains anything from people repeating a comment which is just a glorified +1 on "free rate limiting". Who would not want free stuff? You do not have to comment to express that. Just upvote a comment or two you like about "me wanting free rate limiting" if you really want to express yourself.

tombh commented 4 years ago

Am I missing something? Is DDoS mitigation not best accomplished higher up the stack, nginx for example? Even fine-grained rate limiting can be achieved with nginx. Though of course I do concede that Hasura is best placed for managing recursive queries, etc.

ozum commented 4 years ago

I am not sure what is so surprising? Somebody has to pay the bills.

We can justify anything with this sentence. Hasura developed a great tool ❤ and they are trying to sell it, nobody denies that. The feature discussed here is a mission-critical security feature, not something nice to have like "SSO". Do you release anything vulnerable to "abuse/DoS attacks"?

Or have the people complaining here already spent hundreds of hours contributing to the open source Hasura codebase

I have spent not hundreds but thousands of hours on open source software, either creating or contributing. Am I eligible, may I post?

keep this issue about how to get rate limiting (or hooks for rate limiting) to the open source version of Hasura.

As OP, I pointed to a great Apollo blog post about how to add security measures. @beepsoft pointed to an open-source library for scoring. Hasura already implemented and deployed rate limiting months ago. They just cut it from the open core version. @beepsoft also asked a simple question four months ago: whether the open source version would have it.

After that point, what else is there to discuss other than whether this will be added to open source or not?

Could someone from Hasura please kindly answer this question with a "Yes" or "No":

Will the open source version have rate limiting and other future security features such as scoring etc.?

ozum commented 4 years ago

Is DDoS mitigation not best accomplished higher up the stack, nginx for example?

To my knowledge: yes, but only partly. For GraphQL, you need to inspect the query to prevent heavy recursive queries that result in a DDoS. IMHO, if there were a complete gateway solution for GraphQL, that would be a better architecture, similar to what is done for REST. It would also relieve Hasura of the burden of security features.

coco98 commented 4 years ago

Hey folks! Sorry for the late reply, and I appreciate everyone’s kind words, patience and understanding, especially @beepsoft and @mitar 🙏

Will there be rate-limiting in Hasura open-source:

  1. Rate limiting controls that affect query planning and generation are and will be in OSS. Eg: limiting results.
  2. Hasura Pro uses Hasura as a library (it is not a fork!) and is like a gateway to Hasura. For eg: if we do complex query scoring that affects the planner and subsequent query generation, it will land in the open-source version either as an additional control in metadata or as a scoring metric/API that is emitted for any query.
  3. Specifically, for depth-limiting as a rate-limiting feature, we don’t plan for this to be in the open-source version in the immediate future. Currently, overly deep queries can be protected against quite easily with allow-listing (this is a recommended production practice anyway, and it is available in open-source). Alternatively, depth limiting can also be done at the API gateway level. Recursive queries are different from vanilla depth limiting and slightly more complex, because the GraphQL API usage might genuinely need some recursion (e.g. a comment thread). In this case, allow-listing is again an ideal solution.

We are also working on pricing (and an offering) for Hasura Pro that will make it accessible for individual developers and small teams and not just large teams and organisations. Meanwhile, you can build something with nginx or a proxy or use existing IaaS tools (like Cloudflare + Workers), similar to the way non-GraphQL APIs work as well. An open-source contribution for an nginx plugin that would make rate-limiting easy not just for Hasura but for GraphQL in general would be amazing!

If you need more information/metrics from within Hasura to integrate with your current API security stack, please do feel free to open an issue (or ideally an RFC) here!

beepsoft commented 4 years ago

Thanks @coco98 for the detailed explanation!

Allow list seems to be a viable solution for now. My only problem with it is that it is rather difficult to collect all the queries my various modules send to Hasura. Is there some support (planned) for this? For example, a special mode of Hasura - like the migration mode now - which records all the queries sent to Hasura and at the end with a click of a button or a cli call would generate an allow list from the collected queries. I guess this could also be implemented with eg. Apollo acting as a gateway for Hasura and collecting the queries.
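Something like the following Python sketch is what I have in mind for the "collect, then generate" step. `build_allow_list` is a made-up helper, and I'm assuming the output should look like the body of Hasura's `create_query_collection` metadata call (a `name` plus `definition.queries` entries with `name` and `query`); please check that against the docs before relying on it.

```python
import re


def normalize(query: str) -> str:
    """Collapse whitespace so formatting differences don't create duplicates.

    Caveat: this also collapses whitespace inside string literals.
    """
    return re.sub(r"\s+", " ", query).strip()


def build_allow_list(recorded_queries, collection="allowed-queries"):
    """Turn queries recorded by a gateway or test run into one collection.

    The returned dict follows the assumed create_query_collection shape.
    """
    unique = {}
    for q in recorded_queries:
        unique.setdefault(normalize(q), None)  # a dict keeps first-seen order
    return {
        "type": "create_query_collection",
        "args": {
            "name": collection,
            "definition": {
                "queries": [
                    {"name": f"query_{i}", "query": q}
                    for i, q in enumerate(unique, start=1)
                ]
            },
        },
    }
```

The recording part could live in whatever sits between the clients and Hasura (a gateway, or a test harness), and the resulting JSON would then be applied as metadata.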

ozum commented 4 years ago

@beepsoft, not directly what you want, but I saw these libraries. They may help:

https://github.com/apollographql/persistgraphql
https://github.com/benjie/persistgraphql
https://github.com/benjie/persistgraphql-webpack-plugin

beepsoft commented 4 years ago

@ozum analyzing .graphql files is a good idea, thanks! I also have dynamically generated Hasura queries and mutations in which case I can only identify those either when I actually send them to Hasura or when received by Hasura or something in between me and Hasura. I feel there could be existing solutions for this already. Maybe one of these?

https://gitlab.com/arboric/arboric
https://github.com/nautilus/gateway

ozum commented 4 years ago

@beepsoft they seem nice and could also be useful for other purposes.

AFAIK, whitelisting requires that your queries be fixed before the Hasura schema is built, and only GraphQL variables can be changed at runtime. Are you sure you could use whitelisting for dynamically generated queries?

@coco98 could answer that better, if I'm wrong.

beepsoft commented 4 years ago

@ozum My use case is a (would be) generic form system, which can work with any Hasura schema + some other configurations and generate forms to edit the entities of that schema. So practically these will be the same queries/mutations for all entities but with their specific types and fields included in the query/mutation expressions. Then I would have my tests, which force generation of all the queries possible for the given schema and I would like to have those collected in the allow lists.

Actually I start off from Java/JPA and generate Hasura customizations based on it (using https://github.com/beepsoft/hasuraconf) and now working on adding also json-schema generation (https://github.com/victools/jsonschema-generator) with some custom fields (eg. to describe the relationship between entities, add field validation information, etc.): things that cannot be derived from the Hasura graphql schema itself. Then I have mst-gql (https://github.com/mobxjs/mst-gql) to generate state store and graphql communication handling and will eventually have my components to actually display forms using all the above. And then, in production it would be great to have allow lists generated automatically. :-)

arpitjacob commented 4 years ago

@coco98 thanks for the update on this. Looking forward to this feature in the OSS version. Also, is there any guide out there for best practices on rate limiting? I have an app that I run for a non-profit, and it has a large user base, so we can't afford a big budget for infrastructure.

arpitjacob commented 4 years ago

Hi @coco98, any update or timeline on when this will be pushed into the OSS version?

Bessonov commented 4 years ago

While I fully agree with @mitar's comment, I think Query Cost Analysis (or at least Depth Limiting) is crucial to every GraphQL implementation. That's because an attacker does not need any knowledge of the application: they can just use introspection to find a cycle. The attacker also doesn't need a bunch of resources like a botnet; it's just a matter of some copy & paste. And every such query is able to bring the database, and therefore the whole application, down. Just a single query issued from a mobile phone over a 2G network is enough.

In the GraphQL world, rate limiting doesn't make any sense, because you can fetch the whole object graph regardless of whether you need the data or not. That becomes very expensive for the service/API provider (shooting themselves in the foot). Rate limiting is a thing to bill partners for usage of REST-like endpoints, even if in a wrong way. So it's really an enterprise feature. But it can be practicable together with depth limiting. @arpitjacob for rate limiting alone, look at something like nginx. Nginx is very lightweight and powerful. BTW, I think the auth hook can be misused for rate limiting.

Proposed allow-listing isn't practicable for most applications. It leads GraphQL ad absurdum. On the one side, you should fetch only the data you need (sold as the underfetching/overfetching solver), but on the other side you are forced to use exact predefined queries. It's even worse: "The order of fields in a query will be strictly compared." W00t? And think about this: how will you manage old apps which are not updated by users (= uncontrolled environments)? For nearly every small change you are forced to add a new copy of the definition to the allow list, and you can't just delete the old one.

From the first post you can navigate to the graphql-cost-analysis project. I like the idea of not trying some cost heuristics, but instead allowing developers to specify arbitrary cost values. It's powerful and makes the implementation easier. The integration examples are very easy with Apollo Engine and others. But it seems impossible to integrate it with Hasura, because Hasura has a fundamentally different view on architecture.
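A tiny Python sketch of the "developer-specified costs" idea, to show why it is easy to implement. This is not the graphql-cost-analysis API. `FIELD_COSTS`, its values and `query_cost` are invented for illustration, and the query is assumed to be already parsed into nested dicts.

```python
# field name -> (own cost, expected number of rows it returns); values are
# arbitrary and would be chosen by the developer who knows the data.
FIELD_COSTS = {"todos": (1, 10), "user": (1, 1)}
DEFAULT_COST = (0, 1)  # plain scalar fields are free in this sketch


def query_cost(selection: dict, costs: dict = FIELD_COSTS) -> int:
    """Cost of a selection set given as nested dicts: {field: sub_selection}.

    Each field pays its own cost once, and its children are paid once per
    expected row, so nested list fields multiply.
    """
    total = 0
    for name, children in selection.items():
        cost, rows = costs.get(name, DEFAULT_COST)
        total += cost + rows * query_cost(children, costs)
    return total


# todos { id user { id todos { id } } }
selection = {"todos": {"id": {}, "user": {"id": {}, "todos": {"id": {}}}}}
```

Here `query_cost(selection)` is 21: the inner `todos` costs 1, `user` costs 1 + 1, and the outer `todos` costs 1 + 10 * 2. Every additional level of `todos` multiplies the total by roughly ten, which is exactly what makes a cost ceiling effective against cyclic queries.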

I see different ways to approach the problem. The worst one is to try to implement something like a GraphQL proxy which can be instrumented with costs on queries (and mutations?). But it leads to double maintenance and potentially exposes the structure by bypassing permissions on introspection. It could still work for rate limiting and depth limiting, and it's possible that there are already some usable/lightweight solutions. The two better ones are to implement functionality similar to graphql-cost-analysis and/or to allow specifying and processing custom directives. I think the latter is fully compatible with community vs. enterprise versions of Hasura: for the community version we can build our own cost limiter, and the enterprise version can offer one out of the box.

I understand, that hasura must make money. But I hope there is a place for an open source solution to problems with a huge impact.

mitar commented 4 years ago

So current business model is that you get unrestricted queries for free, but for restricted queries you have to pay. Would people prefer that Hasura has a different business model where you get restricted hard-coded queries of depth up to 2 for free, but for unrestricted queries or for other cost metrics you have to pay?

arpitjacob commented 4 years ago

They are offering the following feature on their Free Tier.

API limiting - Add depth-limit and rate-limit rules to prevent abuse of your API.

I don't want to debate or complain, but this is such a key feature, and it would be useful to have it in the open source version. I hope they can push this to the open source version.

Bessonov commented 4 years ago

Would people prefer that Hasura has a different business model where you get restricted hard-coded queries of depth up to 2 for free, but for unrestricted queries or for other cost metrics you have to pay?

I'm not sure what you mean by "restricted hard-coded queries".

Personally, I really like the following business model: free for free AND open source databases (postgres, mariadb, yugabytedb, cockroachdb core) with an optional support subscription, and billed for commercial databases (enterprisedb, yugabyte cloud/platform, cockroachdb enterprise/cloud, MS SQL Server, oracle, db2).

hugomn commented 3 years ago

Does anyone already have a solution on how to add rate-limiting to the open-source version? I totally understand @mitar's point and that here maybe is not the best place to discuss Hasura's business model. But I also agree that API rate-limiting is such a basic security concept that no tool should be considered production-ready if it doesn't offer rate limiting.

urgent commented 3 years ago

Does anyone already have a solution on how to add rate-limiting to the open-source version? I totally understand @mitar's point and that here maybe is not the best place to discuss Hasura's business model. But I also agree that API rate-limiting is such a basic security concept that no tool should be considered production-ready if it doesn't offer rate limiting.

Hasura Container + Fail2Ban on host?

jokester commented 3 years ago

Simple rate limiting may be doable with POST webhook since Hasura v2.0, where auth service can see graphql query and decide to permit or forbid it.

Query scoring might be doable with such webhook + external auth service too. While I personally think query allowlist could be an option too: at least it is simpler. (I'm using React + Relay which has an option to export query texts)
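A rough Python sketch of what such an auth service could do with the POST payload. I'm assuming the body contains the client `headers` plus a `request` object with the GraphQL `query`, and that a 200 response with `X-Hasura-*` session variables permits the request while a 401 forbids it; `x-user-id` and the `is_allowed` policy callback are hypothetical names of mine.

```python
def auth_hook(body: dict, is_allowed) -> tuple:
    """Decide on one webhook call; returns (status_code, response_json).

    `is_allowed(user_id, query)` is any policy you plug in, for example a
    rate limiter or a depth/cost check on the query text.
    """
    # Header names may arrive in any casing, so compare in lower case.
    headers = {k.lower(): v for k, v in body.get("headers", {}).items()}
    request = body.get("request") or {}
    user_id = headers.get("x-user-id")  # hypothetical, set by upstream auth
    if not user_id or not is_allowed(user_id, request.get("query", "")):
        return 401, {}
    # Session variable values are sent back as strings.
    return 200, {"X-Hasura-User-Id": str(user_id), "X-Hasura-Role": "user"}
```

The function is kept free of any web framework on purpose, so it can be wrapped by whatever HTTP server the auth service already uses.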

ilijaNL commented 2 years ago

Simple rate limiting may be doable with POST webhook since Hasura v2.0, where auth service can see graphql query and decide to permit or forbid it.

Query scoring might be doable with such webhook + external auth service too. While I personally think query allowlist could be an option too: at least it is simpler. (I'm using React + Relay which has an option to export query texts)

This approach is kind of a hack, since the webhook is meant to be used as an authentication endpoint.

It would be more useful to have a pre-execute hook which is triggered after authentication but before graphql-engine execution. Several things should be possible in this webhook like rejecting the query, validating the input and calculating the cost. Even extending the query could be an option..., see https://github.com/mercurius-js/mercurius/blob/master/docs/hooks.md#preexecution for a possible call signature

hongbo-miao commented 2 years ago

Hasura is an awesome project!

Just providing a workaround. For rate limiting, one way is to use Traefik as a reverse proxy in a sidecar container. Hopefully it saves some time for people in the future!

For scoring, Traefik provides many middlewares and also lets you write new middleware and plug it in, so technically we could write one. It would be great if someone wrote one and published it so everyone can use it : )

ilijaNL commented 1 year ago

If people are still wondering how to secure Hasura and possibly get CDN caching with it, check out the article I wrote on how to use edge functions (Vercel or Cloudflare Workers) to create a reverse proxy to improve security: https://ilijanl.hashnode.dev/how-to-secure-and-make-graphql-blazingly-fast

Another alternative is WunderGraph.