Closed ifokeev closed 2 months ago
@ifokeev Thanks for posting this! There's a hard limit for security purposes. Could you please elaborate a little bit on your use case?
There's a hard limit for security purposes
This hard limit should be in the playground, not in the API. Users know best what SQL they need when using the standalone API. If it's that important, there could be something like a skipRestrictions flag.
Could you please elaborate a little bit on your use case?
I just need to query more than 50000 rows. I use it to predict dimensions, not measures. Right now Cube.js prevents me from doing this.
@ifokeev I see. Most cube.js APIs are exposed directly to some sort of untrusted environment. Without this check, it's very easy to exploit them for a DoS attack.
As was discussed previously, I'd suggest providing a separate API for exporting and downloading results. What do you think?
@paveltiunov
As was discussed previously, I'd suggest providing a separate API for exporting and downloading results. What do you think?
That may be a good solution. I found that cubejs-server-core is not extensible by design, and I can't change the schema validation in cubejs-api-gateway either.
Another solution could be to allow a user-supplied schema for queries, or to disable validation altogether. For example, I use the GraphQL API and don't need validation from api-gateway.
Most cube.js APIs are exposed directly to some sort of untrusted environment. Without this check, it's very easy to exploit them for a DoS attack.
I don't think it is cube.js's responsibility to protect against DoS attacks. Having a default value that limits the results is good, but I should be able to override that value with something greater if my requirements call for it. If I do, then it is my responsibility to protect against DoS attacks. In my case, I am using cube.js in an internal, trusted corporate environment where I do not have to worry about DoS attacks.
@rickj33 Hey Rick! I think we can consider adding an option to override the default limit on the server. Could you please elaborate on your use case, though? Is it a browser that needs to load more results, or some other service that just hits the cube.js API? How many rows in total do you need to download?
@paveltiunov let's go with overriding the default validation schema, not only the limit
That sounds good.
@ifokeev Could you please elaborate?
@paveltiunov I mean we need a user-defined schema here: https://github.com/cube-js/cube.js/blob/master/packages/cubejs-api-gateway/index.js#L93
Allow the user to pass their own schema, or to disable it altogether
+1 on api export
@paveltiunov Here's a use case I have: we've set up our own query 'playground' front end for people to build into an Angular dashboard app. When they set a filter, we submit a query for that dimension, with the same other filters, to populate an autocomplete dropdown. Sometimes they select a field that has more than 50k potential items in it. In that case the query doesn't return all values, and they end up thinking that the system is 'missing' things. Granted, in some cases we could work around this using paging/offset etc., but we can't count on being able to do that 100% of the time. I personally would love to be able to override the limit on queries.
@JoshMentzer I see. Do you show all the 50k potential items in dropdown? Or do you provide any kind of search by name functionality here?
@paveltiunov No, we don't show all 50k, just the ones that match what they type in the autocomplete/search-ahead, whatever you want to call that type of control. We are using virtual scrolling there, so I'm sure we could tie the query to the filter with a 'contains' or some such, instead of searching directly in the list that comes back from the query. It is really just a choice we made for UX reasons: if it has to go back and hit cube for a query, it 'feels' much less performant. We would of course do that if we were concerned about user system resources, etc., but in our case we have a controlled audience and can trade resource use for the performant feel.
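For readers weighing the 'contains' approach mentioned above, here is a minimal sketch of what such a query body could look like. It only builds the JSON query; the member name Users.city and the helper name autocomplete_query are hypothetical, and sending the query to the API is left out.

```python
import json


def autocomplete_query(dimension, typed_text, page_size=50):
    """Build a query that returns only values of `dimension` containing `typed_text`.

    The server does the matching, so the client never needs the full value list.
    """
    return {
        "dimensions": [dimension],
        "filters": [
            {"member": dimension, "operator": "contains", "values": [typed_text]}
        ],
        "order": {dimension: "asc"},
        "limit": page_size,
    }


# "Users.city" is a hypothetical member name used for illustration only.
query = autocomplete_query("Users.city", "san")
print(json.dumps(query, indent=2))
```

The trade-off is the one described in the comment: each keystroke (or debounced batch of keystrokes) costs a round trip, in exchange for never loading more than page_size rows.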
We're doing something similar to @JoshMentzer in that we're allowing authorized users to run queries to build lists of people/users. Said lists only need to show a preview on the client so we could use something like the ?limit or ?offset parameters to truncate results. However, we'd need to show the total number of results from our BQ database based on a secondary provided query.
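As a rough illustration of the limit/offset workaround mentioned in the last few comments, the sketch below pages through a result set in fixed-size chunks. The fetch_page callable is an assumption: in a real client it would send the query with the given limit and offset to the API, while here an in-memory stand-in is used so the example runs on its own.

```python
from typing import Callable, Dict, Iterator, List

Row = Dict[str, object]


def iter_rows(fetch_page: Callable[[int, int], List[Row]],
              page_size: int = 10000) -> Iterator[Row]:
    """Yield every row by requesting pages of `page_size` until a short page arrives."""
    offset = 0
    while True:
        rows = fetch_page(page_size, offset)
        yield from rows
        if len(rows) < page_size:
            break  # a short (or empty) page means there is nothing left
        offset += page_size


# Stand-in for a real API call: 25 fake rows served from memory.
FAKE_TABLE = [{"Users.id": i} for i in range(25)]


def fake_fetch(limit: int, offset: int) -> List[Row]:
    return FAKE_TABLE[offset:offset + limit]


all_rows = list(iter_rows(fake_fetch, page_size=10))
print(len(all_rows))
```

Offset paging only returns each row exactly once if the query has a deterministic order, so a real query should sort on a stable key. It also does not give the total row count up front, which is the gap the comment above points out.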
somewhat related: https://cube-js.slack.com/archives/CC0403RRR/p1611741898136000?thread_ts=1611718901.127000&cid=CC0403RRR
Is it possible to override the hard limit of 50.000?
@paveltiunov Being able to override the 50000 limit on queries would be very useful for my use case. It's a sensible default, but it would be great to be able to configure it. Are you working on this? If not, would you consider merging a PR from me if I find time to do it?
@paveltiunov We are also having issues because of this limitation. Is there any plan to add this in a future release?
@paveltiunov we are facing the same issues when using CUBE to populate our BOARD data model for reporting purposes. Has a solution been found, or at least a temporary workaround?
CUBEJS_DB_QUERY_LIMIT can be used to override the default limit. Setting it to big values may cause out-of-memory crashes.
Hello @paveltiunov, does changing this parameter also change the default value for the SQL API?
Hey all, I've added a section in the docs (https://cube.dev/docs/product/data-modeling/queries#row-limit) that explains the row limit as well as the CUBEJS_DB_QUERY_LIMIT environment variable that you can use to bump it. Just be advised that bumping the row limit substantially may cause out-of-memory (OOM) crashes.
I'm keeping this issue open to further track the possible introduction of a "Download/Export API."
I've tried setting the env var but the cube-generated SQL query still has LIMIT 10000, regardless.
This is when calling the Cube API with GraphQL and with Cube Docker Image v0.35.47
I've tried both CUBEJS_DB_QUERY_LIMIT=50000 and CUBEJS_DB_QUERY_LIMIT="50000"
Edit: I misunderstood the use of the ENV Var. You can specify a limit in your query up to 50000, without changing the env var, got it. e.g. query CubeQuery { cube(limit: 50000, where(...
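To make the distinction in the edit above concrete, here is a toy model of the behavior reported in this thread: a query with no limit gets 10000, an explicit limit is accepted up to 50000, and anything larger is rejected. This is not Cube's implementation; the function name resolve_limit and the constants are assumptions taken from the numbers quoted in the comments.

```python
DEFAULT_LIMIT = 10000  # LIMIT observed in the generated SQL when the query sets none
MAX_LIMIT = 50000      # the cap discussed in this thread


def resolve_limit(requested=None, default=DEFAULT_LIMIT, maximum=MAX_LIMIT):
    """Return the row limit that would apply to a query under this toy model."""
    if requested is None:
        return default  # no explicit limit: fall back to the default
    if requested <= 0:
        raise ValueError("limit must be a positive integer")
    if requested > maximum:
        raise ValueError(f"limit {requested} exceeds the maximum of {maximum}")
    return requested


print(resolve_limit())       # no limit in the query
print(resolve_limit(50000))  # explicit limit at the cap
```

Under this model, raising the maximum alone does not change what a query without an explicit limit receives, which matches the observation that the generated SQL kept LIMIT 10000 until a limit was passed in the query itself.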
Streaming mode in the SQL API removes this limitation.
Describe the bug
api-gateway doesn't allow passing a limit over 50000 (https://github.com/cube-js/cube.js/blob/1a8260522a234ebaabb9360f58e9b62095fd87f7/packages/cubejs-api-gateway/index.js#L133), and that's strange. I have no ability to use offset, but I want to increase the limit.
Expected behavior
There should be no limit in api-gateway, though there may be one in the playground.
Updated by @igorlukanin. See partial solution: https://github.com/cube-js/cube/issues/251#issuecomment-2046918025