Closed atrauzzi closed 1 year ago
On first look it looks fine if as you said the names are different. I'll try to repro your particular example.
But one redudency I'm seeing is that you're calling OrderBy
before KeysetPaginate
. KeysetPaginate
will override any orders you configure, so you don't need to call that yourself. This shouldn't be causing the issue though. You can check MR.AspNetCore.Pagination's code for an example of how to properly do this.
Regarding troubleshooting, I publish the PDBs with all my packages, so you should be able to debug into the library code from your code. That might be helpful. (You can lookup how to do that, I think as a start make sure you don't have "Just my code" checked.)
Cleaned up the redundancy, no impact. Here's a sample of the query being generated when I try to pass a reference:
SELECT TOP(4) [t].[id] AS [Id], [t].[name] AS [Name], [t].[last_modified] AS [LastModified]
FROM [mymodel] AS [t]
WHERE ([t].[organization] = 'SOMEGUID') AND ((CASE
WHEN [t].[last_modified] >= @__p_1 THEN CAST(1 AS bit)
ELSE CAST(0 AS bit)
END & (CASE
WHEN [t].[last_modified] > @__p_1 THEN CAST(1 AS bit)
ELSE CAST(0 AS bit)
END | (CASE
WHEN [t].[last_modified] = @__p_2 THEN CAST(1 AS bit)
ELSE CAST(0 AS bit)
END & CASE
WHEN [t].[name] > @__p_3 THEN CAST(1 AS bit)
ELSE CAST(0 AS bit)
END))) = CAST(1 AS bit))
ORDER BY [t].[last_modified], [t].[name]
I feel like it's taking the reference and creating some kind of impossible filter scenario. Again, the one anomaly in the data is the last_modified
being the same between all the records. But that is something I hoped to see mitigated by the second sort on name
.
Oh, and also, my reference entity is the last record from the previous page. Does that make sense?
Oh, and also, my reference entity is the last record from the previous page. Does that make sense?
Yes, that's the idea.
A quick test I did with a model that has the same date column but different names column worked completely fine. And the generated sql you're showing looks very weird to me. So this is proving hard to reproduce on my side. If you can provide a repro repo that I can just run and inspect I might be able to help further.
Hmm, tricky for me to do at this point. Is there anything specific you might suggest I look at?
Maybe try to simplify the model/query to scope out what's causing this. I see in the generated sql a predicate on an organization
column. Start by removing that maybe. Also, I checked the generated sql on SqlServer in my test, below. It's similar to yours. I didn't know SqlServer generated such noisy sql here. But again, it worked just fine.
SELECT TOP(@__p_3) [i].[Id], [i].[Created], [i].[Name]
FROM [Issue24Models] AS [i]
WHERE (CASE
WHEN [i].[Created] >= @__p_0 THEN CAST(1 AS bit)
ELSE CAST(0 AS bit)
END & (CASE
WHEN [i].[Created] > @__p_0 THEN CAST(1 AS bit)
ELSE CAST(0 AS bit)
END | (CASE
WHEN [i].[Created] = @__p_1 THEN CAST(1 AS bit)
ELSE CAST(0 AS bit)
END & CASE
WHEN [i].[Name] > @__p_2 THEN CAST(1 AS bit)
ELSE CAST(0 AS bit)
END))) = CAST(1 AS bit)
ORDER BY [i].[Created], [i].[Name]
The actual relational expression generated is much simpler, here:
SELECT TOP(@__p_3) i.Id, i.Created, i.Name
FROM Issue24Models AS i
WHERE (i.Created >= @__p_0) & ((i.Created > @__p_0) | ((i.Created == @__p_1) & (i.Name > @__p_2)))
ORDER BY i.Created ASC, i.Name ASC),
In any case, can you show me how you're loading the data? What's after that last Take(*)
line? I'm guessing the code snippet you showed above is not the actual code that's generating that sql. Can you show it all?
@mrahhal -- The only part that's missing is a .ToListAsync()
.
I feel like there's something wrong going on with how the WHERE
s are being generated when there's multiple sort columns. But I just can't seem to figure out how/why.
Especially that my attempt at a repro worked fine. So won't be able to proceed here without a repro.
Also to point out, this library only generates the dynamic C# expressions, and EF Core translates those. You can try and show me the generated expressions, not the sql (like I did above, they should be logged to the same place just before the generated sql log). That might be easier to understand, and can tell me if there's anything wrong with the generated expressions themselves.
I wonder if it's because the date field that I'm trying to include in the sort order is nullable?
I didn't know that. This is why a repro is important 😅 I can actually see there are no results returned if I change it to nullable and make them all null (giving them the same value, as your first comment seemed to suggest, even if it's nullable still returned results). I'll investigate.
So, this occurs when the reference has a null value in its keyset. Comparison operators don't really work with NULLs in SQL and they invalidate the whole WHERE clause, resulting in no results. I missed to account for this.
Unfortunately, this isn't an easy problem to solve, and the best I can say right now is you'll want to avoid having nullable columns in your keyset. This is a problem related to SQL itself, compounded by the fact that the SQL standard does not define a default ordering of NULLs, but mainly because NULL is a very special value in SQL.
I was trying to figure out if I can confidently always do something with nulls in the reference in a predictable way, but I can't even begin to foresee all the different problematic cases. There are a lot of caveats. WHERE Created < "2022"
will never return rows with NULL in Created
, no matter if the DB sorts NULLs first or last.
In addition, while coalescing NULLs into some other value might be a workaround in specific cases, I don't feel it'll work as a general solution in this library.
So this is looking like an unsolvable problem to me. Only thing I can recommend, again, is to not have nullable columns in the keyset. And I'm now beginning to wonder if I should actually throw an exception if that is detected, as I can't think of a single correct use case that can work here.
Also, this is pretty much the same as https://github.com/mrahhal/MR.EntityFrameworkCore.KeysetPagination/issues/18, but I didn't investigate it at the time.
And I also just realized that I don't have good test coverage on this particular issue. I only simply tested that the DB didn't scream at me when there are nulls involved.. which obviously wasn't enough to detect this.
Awesome! Yes, apologies, the repro was a little difficult for me to make without knowing what was causing the problem. I might have ended up creating a repro without the issue and come up with nothing! 😆
I think our conversation did help and I was able to keep poking at it on my end.
What if you made it so that if a column was detected as nullable, the query was augmented with where [column] is not null
, and then that just ensures that the data is never tainted with null values? This is obviously a workaround and would require documentation, but at least people could still paginate on those columns if they so decided to accept the limitation?
First of all you have to decide if you want null values to appear at the beginning or at the end of the sorted values. I am using SQL-Server, which uses NULLS-First (that is NULL values are lower than any other value), so I am doing the same. Once you know where to place null values, you can generate ad-hoc comparison expressions between "field" and "value" in all the possible cases:
I believe if you modify your private static BinaryExpression MakeComparisonExpression
What if you made it so that if a column was detected as nullable, the query was augmented with
where [column] is not null
, and then that just ensures that the data is never tainted with null values?
This will work only for very specific cases, it's riddled with caveats (see @raffaele-cappelli's comment).
@raffaele-cappelli, this is assuming I can tell what the sorting of NULLs will be from this library, which I can't do. Sure, I can make it an option for the user to provide, but this makes it that much more error prone and harder to consume as a library. The alternative of trying to somehow get the registered data provider and map that to a well known sorting order also doesn't appeal to me (especially that this library exposes its apis as extension methods on top of IQueryable).
This is also assuming that we can tell if a column is nullable without going back to check the table's schema, which isn't the case at all. For exmaple, apps that aren't annotated for NRTs. And even for apps that are, users can always configure nullable columns irrelevant of the .net nullability context.
One workaround I can suggest is to create a (non-nullable) computed column that coalesces nulls into some value that makes sense to your specific use case (and results in the sorting you expect), and then use that in the keyset instead. If this works just fine as I'm expecting here, I think this would be the best way to go about solving this, as opposed to introducing flaky complicated code in the library to solve a niche scenario.
That's a great idea.
I verified that using a computed column works properly and is predictable. It also allows you to deterministically specify the sorting of NULLs, irrelevant of the database impl. I'll be adding an example to the tests and sample.
In addition, soon I'll release a new version with an analyzer that detects nullable properties in the keyset and warns about it, pointing to the workaround. This might be better than an exception, as you'll be able to suppress it if you really really want to.
https://github.com/mrahhal/MR.EntityFrameworkCore.KeysetPagination/blob/main/docs/caveats.md#null
Let me know if this explains the problem and the workaround clearly 🙂
I'm having an issue where whenever I try to paginate after a specific record, my result set comes back empty.
The only thing noteworthy about the test data I'm using is that they all have the same
ModifiedUtc
value. Though their names are different.Is there any good way to troubleshoot whatever is going on here?
Thank you for this really cool library, looking forward to mastering using it!