duffelhq / paginator

Cursor-based pagination for Elixir Ecto
MIT License
750 stars 90 forks

Pagination does not work with joins #59

Open AndrewDryga opened 5 years ago

AndrewDryga commented 5 years ago

If the query has joins, the limit you apply to the query counts joined rows, not parent records. As a result, you get fewer records than you expect or, if one record has a lot of joined rows, a single record without all of its joined assocs.

This is a classic error, which can be worked around like this:

select *
from (select * from users order by id limit 10 offset 10) as u
left join files f
  on u.id = f.user_id

But the library sets the limit itself, so an application developer has no way to wrap the limited query in a subquery.
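A minimal sketch of the problem and the subquery workaround, using Python's built-in sqlite3 as a stand-in for Postgres and Ecto. The table contents are made up for illustration; only the `users`/`files` names come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE files (id INTEGER PRIMARY KEY, user_id INTEGER);
""")
conn.executemany("INSERT INTO users (id) VALUES (?)", [(i,) for i in range(1, 6)])
# Hypothetical data: user 1 owns three files, every other user owns one.
conn.executemany(
    "INSERT INTO files (user_id) VALUES (?)",
    [(1,), (1,), (1,), (2,), (3,), (4,), (5,)],
)

# LIMIT applied after the join: it counts joined rows, not users.
naive = conn.execute("""
    SELECT u.id, f.id
    FROM users u
    LEFT JOIN files f ON u.id = f.user_id
    ORDER BY u.id, f.id
    LIMIT 3
""").fetchall()

# LIMIT applied inside the subquery: it counts users, and the join happens afterwards.
wrapped = conn.execute("""
    SELECT u.id, f.id
    FROM (SELECT * FROM users ORDER BY id LIMIT 3) AS u
    LEFT JOIN files f ON u.id = f.user_id
    ORDER BY u.id, f.id
""").fetchall()

naive_users = sorted({row[0] for row in naive})
wrapped_users = sorted({row[0] for row in wrapped})
print("users seen with LIMIT after join:", naive_users)
print("users seen with LIMIT in subquery:", wrapped_users)
```

With this data the first query spends its whole limit on user 1's three files, while the second returns three distinct users together with all of their files.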

AndrewDryga commented 5 years ago

Related Ecto issue: https://github.com/elixir-ecto/ecto/issues/2926

bencoppock commented 3 years ago

@AndrewDryga is this still an issue? Or did the fix for the related Ecto issue fix the issue in Paginator as well?

AndrewDryga commented 3 years ago

@bencoppock this is still an issue. Internally we use our own fork of paginator that removes Ecto preloads for all one-to-many assocs when they are made on a paginated query, and preloads them after the request in separate queries. But the best solution would be to construct a proper query that doesn't break with pagination.

It's important to note that this library is probably still the best one our community has at the moment, so if you are looking for one, use it. Other pagination libs use offset paging, which doesn't scale, and they suffer from similar problems.
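The "paginate first, preload afterwards" idea can be sketched roughly as follows. This is not the fork's code (which was not shared); it is an illustration in Python's sqlite3 with hypothetical `users`/`files` data, showing the page query without any join and the one-to-many rows loaded in a second query.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE files (id INTEGER PRIMARY KEY, user_id INTEGER);
""")
conn.executemany("INSERT INTO users (id) VALUES (?)", [(i,) for i in range(1, 6)])
# Hypothetical data: user 1 owns three files, every other user owns one.
conn.executemany(
    "INSERT INTO files (user_id) VALUES (?)",
    [(1,), (1,), (1,), (2,), (3,), (4,), (5,)],
)

# Step 1: paginate the parent table only, so LIMIT counts users.
page_ids = [row[0] for row in conn.execute("SELECT id FROM users ORDER BY id LIMIT 2")]

# Step 2: load the one-to-many rows for exactly those users in a separate query.
placeholders = ",".join("?" for _ in page_ids)
files_by_user = defaultdict(list)
for file_id, user_id in conn.execute(
    f"SELECT id, user_id FROM files WHERE user_id IN ({placeholders}) ORDER BY id",
    page_ids,
):
    files_by_user[user_id].append(file_id)

# Step 3: stitch parents and children together in application code.
result = [(user_id, files_by_user[user_id]) for user_id in page_ids]
print(result)
```

The trade-off is one extra query per preloaded association in exchange for a page size that always refers to parent records.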

bencoppock commented 3 years ago

Thanks, @AndrewDryga. Yeah, we definitely want keyset pagination as opposed to offset-based pagination, so that's what led me here. However, the fact that this library drops records with null in the cursor values is a non-starter for us unfortunately.
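One plausible way records with NULL cursor values get lost, shown as a sketch rather than as paginator's actual query: a keyset predicate built from `>` and `=` never matches a row whose cursor column is NULL, because comparisons with NULL are not true in SQL. The `nickname` column and the data are assumptions for illustration, and `ORDER BY nickname IS NULL` emulates a NULLS LAST ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, nickname TEXT)")
conn.executemany(
    "INSERT INTO users (id, nickname) VALUES (?, ?)",
    [(1, "a"), (2, None), (3, "b"), (4, None), (5, "c")],
)

ORDER = "ORDER BY nickname IS NULL, nickname, id"  # NULL nicknames sort last


def fetch_page(cursor, limit=2):
    """Fetch one page; cursor is the (id, nickname) of the last row already seen."""
    if cursor is None:
        return conn.execute(
            f"SELECT id, nickname FROM users {ORDER} LIMIT ?", (limit,)
        ).fetchall()
    last_id, last_nickname = cursor
    # Naive keyset predicate: it can never be true for rows with a NULL nickname.
    return conn.execute(
        f"""SELECT id, nickname FROM users
            WHERE nickname > ? OR (nickname = ? AND id > ?)
            {ORDER} LIMIT ?""",
        (last_nickname, last_nickname, last_id, limit),
    ).fetchall()


seen, cursor = [], None
while True:
    page = fetch_page(cursor)
    if not page:
        break
    seen.extend(row[0] for row in page)
    cursor = page[-1]

all_ids = [row[0] for row in conn.execute("SELECT id FROM users ORDER BY id")]
missing = sorted(set(all_ids) - set(seen))
print("seen:", seen, "never returned:", missing)
```

Walking every page this way returns the three users with a nickname and silently skips the two without one.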

I see that there's a fork called Quarto that fixes the NULLs issue. I'm currently investigating that to see if it'll also support paginating by dynamic values (i.e. values that aren't directly in the database but can be calculated in the database using other values)…

sgerrand commented 3 years ago

@AndrewDryga, are you able to share the changes you made to paginator? We'd like to cover as much surface area as possible, so understanding your use cases and the implementation that was required would be really valuable.

AndrewDryga commented 3 years ago

@sgerrand we altered the paginator itself, but for other purposes (having a cursor on COALESCE(field1, field2)). For pagination with joins we wrote our own wrapper that uses Ecto reflection, as I described above, but it is separate from the paginator library and would be hard to share :(

glennr commented 3 years ago

> I'm currently investigating that to see if it'll also support paginating by dynamic values (i.e. values that aren't directly in the database but can be calculated in the database using other values)…

@bencoppock did you find a solution there?

bencoppock commented 3 years ago

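@glennr it's been a few months since I looked into this, and I'm just now about to get back into it. From what I recall, Quarto seemed an improvement (in that I believe it handled NULLs better), but it also had a couple of issues that might block us from using it. Namely, when I last checked:

- it didn't support paginating from the end of the result set backward
- it didn't support jumping to an arbitrary point and paginating forward/backward from there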

AndrewDryga commented 3 years ago

@bencoppock it does work with both of your use cases but I think the second one is out of scope for any pagination library.

> it didn't support paginating from the end of the result set backward

If I remember correctly, you can add order_by: ... to your paginated query and also define cursors with a direction, so paginating from the end of the list is the same for paginator as paginating from its beginning. Basically, it doesn't care about the order.
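To illustrate why the direction does not matter, here is a small sketch in Python's sqlite3 (not paginator's API; the `page_after` helper and the data are hypothetical). Paginating backward from the end is forward keyset pagination over the same rows in descending order, with the comparison operator flipped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO users (id) VALUES (?)", [(i,) for i in range(1, 8)])


def page_after(cursor, direction, limit=3):
    """Return the next page of ids after `cursor` in the given sort direction."""
    op, order = (">", "ASC") if direction == "asc" else ("<", "DESC")
    where = f"WHERE id {op} ?" if cursor is not None else ""
    params = ([cursor] if cursor is not None else []) + [limit]
    rows = conn.execute(
        f"SELECT id FROM users {where} ORDER BY id {order} LIMIT ?", params
    )
    return [row[0] for row in rows]


# Forward from the beginning.
first_page = page_after(None, "asc")
second_page = page_after(first_page[-1], "asc")

# "Backward from the end" is the same call with the direction flipped.
last_page = page_after(None, "desc")
previous_page = page_after(last_page[-1], "desc")

print(first_page, second_page)
print(last_page, previous_page)
```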

> it didn't support jumping to an arbitrary point and paginating forward/backward from there

And it should not do it for you, but you can still do it yourself. From your example, you can run SELECT cursor_field FROM users WHERE last_name ILIKE 'J%' ORDER BY last_name LIMIT 1 and then, with the cursor field in hand, separately query the page before and the page after that cursor using the pagination library. (Or you can UNION two queries that do that, which is fancier, but writing such a query in Ecto and maintaining it might not be worth the tiny speed optimization you would get out of it.)
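The two-step approach can be sketched like this in Python's sqlite3. It is an illustration under assumptions, not paginator code: the names are made up, the cursor is the pair (last_name, id), and SQLite's LIKE (case-insensitive for ASCII) stands in for ILIKE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, last_name TEXT)")
names = ["Adams", "Baker", "Clark", "Jackson", "Jones", "King", "Lopez"]
conn.executemany(
    "INSERT INTO users (id, last_name) VALUES (?, ?)", list(enumerate(names, start=1))
)

# Step 1: find the cursor of the first record at the point we want to jump to.
anchor = conn.execute("""
    SELECT last_name, id FROM users
    WHERE last_name LIKE 'J%'
    ORDER BY last_name, id
    LIMIT 1
""").fetchone()
name, user_id = anchor

# Step 2a: the page starting at the anchor (inclusive), going forward.
page_from_anchor = [row[0] for row in conn.execute("""
    SELECT last_name FROM users
    WHERE last_name > ? OR (last_name = ? AND id >= ?)
    ORDER BY last_name, id
    LIMIT 2
""", (name, name, user_id))]

# Step 2b: the page just before the anchor, read in reverse and flipped back.
page_before_anchor = [row[0] for row in conn.execute("""
    SELECT last_name FROM users
    WHERE last_name < ? OR (last_name = ? AND id < ?)
    ORDER BY last_name DESC, id DESC
    LIMIT 2
""", (name, name, user_id))][::-1]

print(anchor, page_before_anchor, page_from_anchor)
```

In a real application the anchor query would be written by hand, and the two page queries would be the pagination library's own before/after calls with that cursor.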