Open Arachnid opened 5 years ago
Setting offsets is not the most convenient API, and we need good pagination support. Connections seem like a good model to follow.
offsets from start or end, which likely won't scale well if paging over a large dataset
Could you elaborate on what's the issue you're envisioning here?
In most database systems, a query like SELECT * FROM table LIMIT x OFFSET y involves the database internally iterating over and discarding the first y results. This results in the cost of paginating over a large dataset being O(n^2) instead of O(n). Using cursors, in contrast, doesn't suffer from this issue.
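To make the O(n^2) versus O(n) point concrete, here is a minimal sketch that only counts rows touched. It assumes OFFSET y is served by scanning and discarding y rows, and that a cursor lookup can seek straight to the next row through an index (the seek cost itself is ignored). Both function names are hypothetical.

```typescript
// Counts how many rows a store touches while paging through `total` rows.
// Assumption: OFFSET y scans and discards y rows before reading the page.
function rowsTouchedWithOffset(total: number, pageSize: number): number {
  let touched = 0;
  for (let offset = 0; offset < total; offset += pageSize) {
    const returned = Math.min(pageSize, total - offset);
    touched += offset + returned; // discard `offset` rows, then read the page
  }
  return touched;
}

// Assumption: a cursor lets the store seek directly to the next row,
// so only the rows that are actually returned get touched.
function rowsTouchedWithCursor(total: number, pageSize: number): number {
  let touched = 0;
  for (let start = 0; start < total; start += pageSize) {
    touched += Math.min(pageSize, total - start);
  }
  return touched;
}
```

For 1000 rows in pages of 100, the offset variant touches 5500 rows and the cursor variant touches 1000. The gap widens quadratically as the dataset grows.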
@Arachnid I see. Though that seems to be more a concern of implementation than of graphql interface. We could do a good implementation of graphql offsets that doesn't use OFFSET, and it's also possible to do a bad implementation of cursors that does use OFFSET on the DB.
True, but I don't think it's possible (at least without low level DB support) to do a good implementation that uses offsets - so better to fix the API early.
-Nick
On Tue, 18 Dec 2018, 05:33 Leonardo Yvens, notifications@github.com wrote:
@Arachnid https://github.com/Arachnid I see. Though that seems to be more a concern of implementation than of graphql interface. We could do a good implementation of graphql offsets that doesn't use OFFSET, and it's also possible to do a bad implementation of cursors that does use OFFSET on the DB.
I wrote a little utility hook that takes care of automatically scraping the endpoint for more results (using skip & limit parameters) until it's exhausted:
import { useQuery } from '@apollo/react-hooks';
import { useRef, useEffect } from 'react';
import { DocumentNode } from 'graphql';

// The second query is optional; when omitted, fetchMore reuses the first query.
type QueryPair = [DocumentNode, DocumentNode?];
type ProceedOrNotFn = (result: any, expected: number) => boolean;

export function useScrapingQuery([query, more]: QueryPair, proceed: ProceedOrNotFn, props?: any) {
  // props is optional, so guard every access to props.variables.
  const variables = (props && props.variables) || {};
  const limit = variables.limit || 100;
  const initialSkip = variables.skip || 0;
  // Tracks how many items have been skipped so far across fetchMore calls.
  const skip = useRef(initialSkip);

  const result = useQuery(query, {
    ...props,
    variables: {
      ...variables,
      limit,
      // Pass a plain number here, not the ref object.
      skip: initialSkip,
    },
  });

  useEffect(() => {
    if (!!result.loading || !!result.error || !proceed(result.data, skip.current + limit)) {
      return;
    }

    result.fetchMore({
      query: more,
      variables: {
        ...result.variables,
        skip: skip.current + limit,
      },
      updateQuery: (previous, options) => {
        skip.current = skip.current + limit;
        const moreResult = options.fetchMoreResult;
        if (!moreResult) {
          return previous;
        }

        // Append each newly fetched list to the list already in the cache.
        const output = Object.keys(moreResult).reduce(
          (carry, current) => ({
            ...carry,
            [current]: carry[current].concat(moreResult[current] || []),
          }),
          previous,
        );

        return output;
      },
    });
  }, [result, skip.current]);

  return result;
}
Basically, you pass a query tuple: the first query is mandatory, and the second is optional and provides a custom query for the "fetch more" logic (e.g. if the first query has other, non-paginated fields in it).
Example:
import gql from 'graphql-tag';

export const FundOverviewQuery = gql`
  query FundOverviewQuery($limit: Int!) {
    funds(orderBy: name, first: $limit) {
      id
      name
      gav
      grossSharePrice
      isShutdown
      creationTime
    }
    nonPaginatedQueryField(orderBy: timestamp) {
      ...
    }
  }
`;

export const FundOverviewContinueQuery = gql`
  query FundOverviewContinueQuery($limit: Int!, $skip: Int!) {
    funds(orderBy: name, first: $limit, skip: $skip) {
      id
      name
      gav
      grossSharePrice
      isShutdown
      creationTime
    }
  }
`;
It uses the "limit" and "skip" query variables. The hook automatically adds these by default.
Additionally, you need to provide a callback that checks if more needs to be fetched after each cycle.
Full usage example:
const FundList: React.FunctionComponent<FundListProps> = props => {
  // Keep fetching while the last page came back full.
  const proceed = (current: any, expected: number) => {
    if (current.funds && current.funds.length === expected) {
      return true;
    }

    return false;
  };

  const result = useScrapingQuery([FundOverviewQuery, FundOverviewContinueQuery], proceed, {
    ssr: false,
  });

  return <div>{...}</div>; // Render the full fund list (keeps adding more items until the resource is exhausted).
};
I have the same problem in our project. If we didn't use any where clause, we could simply save the total count in the schema and use that, but we use complex where clauses, and it's impossible to save a count of the items matched by each query.
I'd like to request a feature that could work in the following way.
// Assume I have an entity like this:
type Token @entity {
  ID: String!
  price: BigInt!
}

// Then we can query like this:
query {
  tokens(where: {price_gt: "30"}) {
    ID
  }
}

// In this case, could we query like this?
query {
  countOf: tokens(where: {price_gt: "30"}) {
    count
  }
  tokens(where: {price_gt: "30"}, first: 1000) {
    ID
  }
}
// If we use a special alias like "countOf", can you return one entity that has a count field?
I don't think this feature would be too difficult for your dev team to add. If you don't have time, I can work with you to add it. Thanks
Just adding my thoughts here.
Today, pagination is implemented on the root of every Query type, and returns a ListType of an entity.
We can implement cursor-based pagination (see spec here https://relay.dev/graphql/connections.htm). It's supported in all popular clients, and makes pagination easy and robust (since it's cursor based, it's easier to get a reliable response than with skip).
We can expose a Connection type on the root Query without changing the existing fields: the new field can coexist with the current API without breaking changes.
Here's an example:
type Query {
  purpose(id: ID): Purpose!
  purposes(filter: PurposeFilter): [Purpose!]!
  purposeConnection(filter: PurposeFilter, paginate: PaginationFilter): PurposeConnection!
}

input PaginationFilter {
  before: String
  after: String
  first: Int
  last: Int
}

type Purpose { ... }

type PageInfo {
  hasNextPage: Boolean!
  hasPreviousPage: Boolean!
  startCursor: String
  endCursor: String
}

type PurposeEdge {
  node: Purpose
  cursor: String!
}

type PurposeConnection {
  pageInfo: PageInfo!
  edges: [PurposeEdge]!
}
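As a rough illustration of how a resolver might fill in such a connection, here is a minimal in-memory sketch covering only forward pagination (first/after). It assumes nodes have a string id and arrive sorted by that id; the paginate, toCursor and fromCursor helpers and the "id:" cursor format are hypothetical, and a real store would replace the linear findIndex with an indexed seek.

```typescript
interface PaginationFilter { after?: string; first?: number; }
interface PageInfo {
  hasNextPage: boolean;
  hasPreviousPage: boolean;
  startCursor: string | null;
  endCursor: string | null;
}
interface Edge<T> { node: T; cursor: string; }
interface Connection<T> { pageInfo: PageInfo; edges: Edge<T>[]; }

// Hypothetical cursor format: clients must treat the string as opaque.
const toCursor = (id: string): string => `id:${id}`;
const fromCursor = (cursor: string): string => cursor.slice(3);

// Assumes `sorted` is ordered by ascending `id`.
function paginate<T extends { id: string }>(
  sorted: T[],
  { after, first = 10 }: PaginationFilter,
): Connection<T> {
  let start = 0;
  if (after !== undefined) {
    const afterId = fromCursor(after);
    // Find the first item strictly after the cursor position.
    const idx = sorted.findIndex(item => item.id > afterId);
    start = idx === -1 ? sorted.length : idx;
  }
  const slice = sorted.slice(start, start + first);
  const edges = slice.map(node => ({ node, cursor: toCursor(node.id) }));
  return {
    edges,
    pageInfo: {
      hasNextPage: start + slice.length < sorted.length,
      hasPreviousPage: start > 0,
      startCursor: edges.length > 0 ? edges[0].cursor : null,
      endCursor: edges.length > 0 ? edges[edges.length - 1].cursor : null,
    },
  };
}
```

Because the cursor encodes a position in the sort order rather than a row count, items inserted or removed before the cursor do not shift the next page.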
Having count aggregation would be very useful for pagination and displaying information in UIs. For example, when filtering with where you could also include a count aggregate with the same conditions and then have a page UI something like:
Found 49 tokens
(show first 10 tokens)
[1] [2] [3] [4]
Is this still being worked on? Pagination with lots of historical data is a huge pain, and applying offsets really, really doesn't scale.
I think it would be useful to add counter / cursor as pagination. Any idea if this feature will be supported?
Presently, it's possible to query entities using a where clause, but this uses offsets from start or end, which likely won't scale well if paging over a large dataset. It'd be good to use the graphql connection pattern, or something similar, where result sets return an opaque cursor that can be passed in on subsequent calls to pick up where the previous query left off.
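From the client's side, "pick up where the previous query left off" could look roughly like the loop below. This is a sketch under assumptions: the Page shape and the fetchPage callback are hypothetical stand-ins for whatever query the API would eventually expose, and nothing here reflects an existing graph-node interface.

```typescript
// Hypothetical page shape: a batch of items plus the cursor to resume from.
interface Page<T> {
  items: T[];
  endCursor: string | null;
  hasNextPage: boolean;
}

// Hypothetical transport: given the last cursor (or null for the first
// page), returns the next page of results.
type FetchPage<T> = (after: string | null) => Promise<Page<T>>;

// Walks the whole result set by feeding each page's cursor into the next call.
async function fetchAll<T>(fetchPage: FetchPage<T>): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | null = null;
  while (true) {
    const page: Page<T> = await fetchPage(cursor);
    all.push(...page.items);
    if (!page.hasNextPage || page.endCursor === null) {
      break;
    }
    cursor = page.endCursor;
  }
  return all;
}
```

The client never inspects the cursor; it only hands it back, which leaves the server free to change how positions are encoded.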