Presipope closed this issue 1 year ago
I find that the answer to this is highly dependent on the performance level you are aiming for and the capabilities you need. For instance, I believe with this library, the cursors are a prefix plus an offset. See here. This means that if a row is removed from the first page before the second page is requested, there is an item missing between the last row of the first page and the first row of the second page. It also means that if the rows are served by a database, the database may need to read through all the rows on the first page before it can read from the second page, even though those rows are skipped (see here).

There is only a single method that uses an `IQueryable` at all, and it contains a `ToList` that is neither asynchronous nor cancelable. (Note: `ToListAsync` is a feature of a database library, such as Entity Framework or linq2db, making asynchronous support unfortunately tied to the database framework.)
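To make the drift concrete: with offset cursors, deleting a row between page fetches shifts every later offset, so an item silently falls through the gap. A minimal sketch in Python (the cursor format here is illustrative, not this library's exact encoding):

```python
import base64

def encode_cursor(offset):
    # Hypothetical "prefix plus offset" cursor, per the description above.
    return base64.b64encode(f"arrayconnection:{offset}".encode()).decode()

def decode_cursor(cursor):
    return int(base64.b64decode(cursor).decode().split(":")[1])

rows = ["a", "b", "c", "d", "e"]

# Page 1: first two rows; the end cursor points at offset 1.
page1 = rows[:2]
end_cursor = encode_cursor(1)

# A row from page 1 is deleted before page 2 is requested...
rows.remove("a")

# Page 2: everything after the stored offset. "c" is skipped entirely,
# because all offsets shifted down by one when "a" was removed.
page2 = rows[decode_cursor(end_cursor) + 1:][:2]
print(page1, page2)  # ['a', 'b'] ['d', 'e'] — 'c' was never returned
```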
For an ideal implementation, the cursors need to contain the value of a unique index based on the sort order of the returned rows. For instance, if using integer primary keys, with rows that have primary keys of 1, 3, 5, 10, and 11, an easy approach is to sort by primary key and have the cursor match the primary key. Then the database need not process any prior keys (1, 3, 5) when requesting a page that starts at id 10, and there are no issues with rows being added or deleted between calls. However, if sorting by, say, last name, which is indexed but not unique, you need to have your database sort by last name and secondarily by primary key, and generate cursors that are a combination of the two, such as "Jones-394" for last name Jones with primary key 394. This approach works for an unbounded list of records (millions if you like), with perfectly efficient retrieval of any page. It just requires custom code for each sort order that you support on the returned rows (as search results and the like commonly offer multiple sort options).
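A sketch of that keyset approach, in Python with SQLite for brevity (the table, cursor format, and function names are all made up for illustration; SQLite's row-value comparison stands in for the equivalent predicate in your database):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, last_name TEXT)")
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [(394, "Jones"), (12, "Smith"), (7, "Jones"), (55, "Adams")])

def page_by_last_name(after_cursor, page_size):
    # Cursor is "lastname-id", e.g. "Jones-394", as described above.
    if after_cursor is None:
        return conn.execute(
            "SELECT last_name, id FROM person "
            "ORDER BY last_name, id LIMIT ?", (page_size,)).fetchall()
    last, pk = after_cursor.rsplit("-", 1)
    # Keyset predicate: strictly after (last_name, id). With an index on
    # (last_name, id) the database seeks directly to this point instead of
    # scanning and discarding all prior rows.
    return conn.execute(
        "SELECT last_name, id FROM person "
        "WHERE (last_name, id) > (?, ?) "
        "ORDER BY last_name, id LIMIT ?",
        (last, int(pk), page_size)).fetchall()

page = page_by_last_name(None, 2)
cursor = f"{page[-1][0]}-{page[-1][1]}"
page2 = page_by_last_name(cursor, 2)
print(page, page2)  # [('Adams', 55), ('Jones', 7)] [('Jones', 394), ('Smith', 12)]
```

The secondary sort on the primary key is what makes the cursor unambiguous when two rows share a last name.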
This library can't help with either of those examples in its current form. As such, I do not use it in any of my production applications. However, for a small personal project, it's probably just fine.
You also may wish to combine the use of data loaders with the connection type, which isn't supported by this package either. This is somewhat important, as a speed trick for SQL queries when skip/take are in play is to first return only the ids and then fetch the full rows once the ids are known. (This can be done internal to the SQL query, or externally, as would naturally occur with the use of a data loader.)
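The ids-first trick looks roughly like this (a hedged sketch in Python/SQLite; in practice, step 2 is exactly the lookup a data loader would batch for you):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany("INSERT INTO doc VALUES (?, ?, ?)",
                 [(i, f"t{i}", "x" * 100) for i in range(1, 101)])

def fetch_page(skip, take):
    # Step 1: page over the narrow key only — the sort and skip touch just
    # the index, not the wide rows.
    ids = [r[0] for r in conn.execute(
        "SELECT id FROM doc ORDER BY id LIMIT ? OFFSET ?", (take, skip))]
    # Step 2: fetch the full rows by id (this is what a data loader batches).
    marks = ",".join("?" * len(ids))
    return conn.execute(
        f"SELECT id, title, body FROM doc WHERE id IN ({marks})", ids).fetchall()

rows = fetch_page(skip=20, take=5)
print([r[0] for r in rows])  # [21, 22, 23, 24, 25]
```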
Very interesting! Thanks a ton for the insight @Shane32. I started rolling my own last night, using this project as inspiration/a starting point.
The connection piece was what I was most worried about tackling so I'm grateful you were able to provide me a ton of info about those problems. Your SQL example will be huge as well since that's the database I use, and I was theory crafting how that would work in my head.
Just FYI: if, within the connection resolver, you use the `IResolveFieldContext.SubFields` property to determine whether `totalCount` was requested, you can save yourself a call to `.CountAsync()` when it is not requested, which, depending on the number of rows in the table, could be costly to execute. But note the comments on the `SubFields` property about its restrictions.
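The C# specifics aside, the optimization amounts to checking the requested subfields before issuing the count. A hypothetical sketch in Python (`requested_subfields` stands in for whatever your resolver context exposes; the table and query are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, last_name TEXT)")
conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, "Jones"), (2, "Smith")])

def resolve_connection(requested_subfields, skip, take):
    # Only pay for the COUNT(*) when the client actually asked for totalCount.
    total = None
    if "totalCount" in requested_subfields:
        total = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
    items = conn.execute(
        "SELECT id, last_name FROM person ORDER BY id LIMIT ? OFFSET ?",
        (take, skip)).fetchall()
    return {"totalCount": total, "items": items}

cheap = resolve_connection({"edges"}, 0, 10)            # no COUNT issued
full = resolve_connection({"edges", "totalCount"}, 0, 10)
```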
And note that I have not used the 'ideal implementation' I described above, as the level of complexity did not justify the use case for my scenario. (But my code performs other optimizations.)
And here's a link on fetch/offset optimizations:
https://sqlperformance.com/2015/01/t-sql-queries/pagination-with-offset-fetch
(I would have done a discussion but it doesn't seem to be possible in this repo)
It seems like a lot of things are out of date, and a pretty old pull request to stop using `clientMutationId` is still sitting around.
Most examples I can find online are still using dependency injection inside of the payload objects, but that doesn't seem to jibe with the newer graphql-dotnet pattern of using the field builder to create scopes, assign services, arguments, etc. I've had issues here and there using async methods, and the examples in the to-do application aren't very helpful here either.
I've cleared a lot of hurdles already in my journey to use relay in our application but it seems like I keep running into roadblocks and I'd rather know up front if it's worth tinkering with this library or not.
So, base question: is it still worth using this library in its current state, or should I try to somehow roll my own implementation (which is a huge long shot)?