mrahhal / MR.EntityFrameworkCore.KeysetPagination

Keyset/Seek/Cursor pagination for Entity Framework Core.
MIT License
218 stars 11 forks source link

Handle null #45

Closed 0Lucifer0 closed 1 year ago

0Lucifer0 commented 1 year ago

Hi, I'm trying to move my current cursor paging implementation to this library and I'm having some issues with null properties.

reading about that https://github.com/mrahhal/MR.EntityFrameworkCore.KeysetPagination/blob/main/docs/caveats.md#null it does not seems to be supported (we currently need computed column to get rid of the null)

It would be nice to handle null properties for ordering without having to add computed columns.

Wouldn't it be possible to put all the null either at the end or at the beginning by comparing with Null (is null or is not null) on all properties (sql allow to check null on non nullable column) so something like

where 
property1 is null or property1 > cursor.property1 
or 
(property1 = cursor.property1 and (property2 is null or property2 > cursor.property2)
or
(property1 = cursor.property1 
    and (property2 = cursor.property2 and  (property3 is null or property3 > cursor.property3)
)
 var whereExpression = Expression.GreaterThan(Expression.Convert(sortProperties[0], orderProperties[0].Type), orderProperties[0]);
        for (var i = 1; i < sortProperties.Count; i++)
        {
            var prop = ((PropertyInfo)orderProperties[i].Member);
            var filterBy = prop.GetValue(cursor);

            var paramIsNull =
                Expression.OrElse(
                    Expression.Constant(filterBy == null, typeof(bool)),
                    Expression.Equal(Expression.Convert(sortProperties[i - 1], orderProperties[i - 1].Type), orderProperties[i - 1])
                );

            whereExpression = Expression.OrElse(
                whereExpression,
                Expression.AndAlso(
                    paramIsNull,
                    Expression.GreaterThan(Expression.Convert(sortProperties[i], orderProperties[i].Type), orderProperties[i])
                )
            );
        }
mrahhal commented 1 year ago

Hi. There's a way to make this work now actually without a computed column, but first I want to explain why your suggestion doesn't work.

More details and a discussion in https://github.com/mrahhal/MR.EntityFrameworkCore.KeysetPagination/issues/24#issuecomment-1298675672, at the time I deemed this too complicated to handle on the library level, especially with needing to make some db related assumptions.

The solution you suggested doesn't give the expected result. Depending on the sorting behavior of the db the result will change, and no matter the behavior you won't be getting the right results anyway.

Here's what I mean. Trying to get the records after the following reference (data taken from the test project):

Id  CreatedNullable                 CreatedComputed
90  2022-05-14 13:26:38.9979962 2022-05-14 13:26:38.9979962

With a computed column that sets NULL dates to a min value:

SELECT * FROM [KeysetPaginationTest].[dbo].[MainModels]
WHERE (CreatedComputed > '2022-05-14 13:26:38.9979962')
   OR (CreatedComputed = '2022-05-14 13:26:38.9979962' AND Id > 90)
ORDER BY CreatedComputed ASC, Id ASC
Id  CreatedNullable                 CreatedComputed
92  2022-05-14 13:28:38.9979962 2022-05-14 13:28:38.9979962
94  2022-05-14 13:30:38.9979962 2022-05-14 13:30:38.9979962
96  2022-05-14 13:32:38.9979962 2022-05-14 13:32:38.9979962
98  2022-05-14 13:34:38.9979962 2022-05-14 13:34:38.9979962

This is the right result. It has reliable performance and consistent results not matter the db specific behavior.

With your suggested query:

SELECT Id, CreatedNullable, CreatedComputed FROM [KeysetPaginationTest].[dbo].[MainModels]
WHERE (CreatedNullable is NULL OR CreatedNullable > '2022-05-14 13:26:38.9979962')
   OR (CreatedNullable = '2022-05-14 13:26:38.9979962' AND Id > 90)
ORDER BY CreatedNullable ASC, Id ASC
Id  CreatedNullable                 CreatedComputed
1   NULL    1900-01-01 00:00:00.0000000
3   NULL    1900-01-01 00:00:00.0000000
5   NULL    1900-01-01 00:00:00.0000000
7   NULL    1900-01-01 00:00:00.0000000
9   NULL    1900-01-01 00:00:00.0000000
...
93  NULL    1900-01-01 00:00:00.0000000
95  NULL    1900-01-01 00:00:00.0000000
97  NULL    1900-01-01 00:00:00.0000000
99  NULL    1900-01-01 00:00:00.0000000
92  2022-05-14 13:28:38.9979962 2022-05-14 13:28:38.9979962
94  2022-05-14 13:30:38.9979962 2022-05-14 13:30:38.9979962
96  2022-05-14 13:32:38.9979962 2022-05-14 13:32:38.9979962
98  2022-05-14 13:34:38.9979962 2022-05-14 13:34:38.9979962

This is obviously not right as it also selects all records with CreatedNullable == null. And it gets even weirder with a reference that has CreatedNullable == null.

The reason one might think this might work is because they're still thinking in terms of offset pagination. This won't work in keyset.


That said, after https://github.com/mrahhal/MR.EntityFrameworkCore.KeysetPagination/issues/37 was done, you could now move this coalescing from a computed column to inside the keyset itself. An example:

b => b.Descending(x => x.Created ?? DateTime.MinValue).Descending(x => x.Id)

This was only released yesterday so make sure you have the latest version before trying it out.

I still haven't added this alternative to the caveat, but you'll need to be aware that this might not give you the best performance compared to a computed column. I'll need to benchmark this.


You mentioned your current cursor paging implementation, how do you handle the nulls there?

0Lucifer0 commented 1 year ago

I guess the current paging implementation may seems to work but does not in some cases.

Checking a bit more in details to what's happening I think the query is different from what I previously said(The expression code is the same I use tho).

This is the expression generated for 3 parameters

{x => (((Convert(x.OrderingField, Nullable1) > value(ObjectCursor).OrderingField) OrElse ((True OrElse (Convert(x.OrderingField, Nullable1) == value(ObjectCursor).OrderingField)) AndAlso (Convert(x.SecondOrderingField, Nullable1) > value(ObjectCursor).SecondOrderingField))) OrElse ((False OrElse (Convert(x.SecondOrderingField, Nullable1) == value(ObjectCursor).SecondOrderingField)) AndAlso (Convert(x.Id, Nullable`1) > value(ObjectCursor).Id)))}

and the one for 2 parameters {x => ((Convert(x.OrderingField, Nullable1) > value(ObjectCursor).OrderingField) OrElse ((False OrElse (Convert(x.OrderingField, Nullable1) == value(ObjectCursor).OrderingField)) AndAlso (Convert(x.Id, Nullable`1) > value(ObjectCursor).Id)))}

False/True is depending if the cursor property is null or not. It was passing unit tests but there may be some case missing that would fail.

Coalescing seems to do the trick. (It passes all the unit tests that were failing without)

mrahhal commented 1 year ago

Great! Closing this as it seems the issue is completely resolved with keyset coalescing.