zompinc / efcore-extensions

Provides window (analytics) functions and binary functions for EF Core. Providers: SQL Server, SQLite, Postgres.
MIT License
60 stars 6 forks source link

Windows function inside of query that will be joined as a table leads to invalid SQL #5

Open Ephron-WL opened 1 year ago

Ephron-WL commented 1 year ago

I was going to use a CTE for this, but linq2db can't figure out how to handle SQL Server temporal tables.

The query below translates to invalid SQL om efcore-extensions:

from change0 in changeSet.TemporalAll()
join change1 in
    (
        from change in changeSet.TemporalAll()
        where change.Id == key
        select new
        {
            change.Id,
            PeriodStart = EF.Property<DateTime>(change, "PeriodStart"),
            RowNumber = EF.Functions.RowNumber(
                EF.Functions.Over().OrderBy(EF.Property<DateTime>(change, "PeriodStart")).PartitionBy(change.Id)
            )
        }
    ) on new { change0.Id, PeriodStart = EF.Property<DateTime>(change0, "PeriodStart") } equals new { change1.Id, change1.PeriodStart }
where change1.RowNumber <= 2
select change0.Id

Produces this...

SELECT [c].[Id]
FROM
    [Ephron.Metadata.DotNet].[Class] FOR SYSTEM_TIME ALL AS [c]
    INNER JOIN
    (
        SELECT [c0].[Id]
              ,[c0].[PeriodStart]
        FROM
            [Ephron.Metadata.DotNet].[Class] FOR SYSTEM_TIME ALL AS [c0]
        WHERE
            [c0].[Id] = @__key_0
    )                                                    AS [t]
        ON [c].[Id] = [t].[Id]
           AND [c].[PeriodStart] = [t].[PeriodStart]
WHERE
    ROW_NUMBER() OVER (PARTITION BY [t].[Id] ORDER BY [t].[PeriodStart]) <= 2;

Should produce...

SELECT [c].[Id]
FROM
    [Ephron.Metadata.DotNet].[Class] FOR SYSTEM_TIME ALL AS [c]
    INNER JOIN
    (
        SELECT [c0].[Id]
              ,[c0].[PeriodStart]
              ,ROW_NUMBER() OVER (PARTITION BY [t].[Id] ORDER BY [t].[PeriodStart])  [RowNumber]
        FROM
            [Ephron.Metadata.DotNet].[Class] FOR SYSTEM_TIME ALL AS [c0]
        WHERE
            [c0].[Id] = @__key_0
    )                                                    AS [t]
        ON [c].[Id] = [t].[Id]
           AND [c].[PeriodStart] = [t].[PeriodStart]
WHERE
    [t].[RowNumber] <= 2;

Below is the expression tree of the LINQ query, which is a little overcomplicated by calling a number of EF Core methods inline.

// using static System.Linq.Expressions.Expression

var key = Parameter(
    typeof(int),
    "key"
);
var change = Parameter(
    typeof(Class),
    "change"
);
var change = Parameter(
    typeof(Class),
    "change"
);
var change0 = Parameter(
    typeof(Class),
    "change0"
);
var change1 = Parameter(
    typeof(<anonymous({ int Id, DateTime PeriodStart, long RowNumber })>),
    "change1"
);
var change0 = Parameter(
    typeof(Class),
    "change0"
);
var change1 = Parameter(
    typeof(<anonymous({ int Id, DateTime PeriodStart, long RowNumber })>),
    "change1"
);
var <>h__TransparentIdentifier0 = Parameter(
    typeof(<anonymous({ Class change0, { int Id, DateTime PeriodStart, long RowNumber } change1 })>),
    "<>h__TransparentIdentifier0"
);
var <>h__TransparentIdentifier0 = Parameter(
    typeof(<anonymous({ Class change0, { int Id, DateTime PeriodStart, long RowNumber } change1 })>),
    "<>h__TransparentIdentifier0"
);

Call(
    typeof(Queryable).GetMethod("Select", new[] { typeof(IQueryable<{ Class change0, { int Id, DateTime PeriodStart, long RowNumber } change1 }>), typeof(Expression<Func<{ Class change0, { int Id, DateTime PeriodStart, long RowNumber } change1 }, int>>) }),
    Call(
        typeof(Queryable).GetMethod("Where", new[] { typeof(IQueryable<{ Class change0, { int Id, DateTime PeriodStart, long RowNumber } change1 }>), typeof(Expression<Func<{ Class change0, { int Id, DateTime PeriodStart, long RowNumber } change1 }, bool>>) }),
        Call(
            typeof(Queryable).GetMethod("Join", new[] { typeof(IQueryable<Class>), typeof(IEnumerable<{ int Id, DateTime PeriodStart, long RowNumber }>), typeof(Expression<Func<Class, { int Id, DateTime PeriodStart }>>), typeof(Expression<Func<{ int Id, DateTime PeriodStart, long RowNumber }, { int Id, DateTime PeriodStart }>>), typeof(Expression<Func<Class, { int Id, DateTime PeriodStart, long RowNumber }, { Class change0, { int Id, DateTime PeriodStart, long RowNumber } change1 }>>) }),
            Call(
                typeof(EntityFrameworkQueryableExtensions).GetMethod("AsNoTracking"),
                --
-- Not implemented - NodeType: Extension not implemented.
--
            ),
            Call(
                typeof(Queryable).GetMethod("Select", new[] { typeof(IQueryable<Class>), typeof(Expression<Func<Class, { int Id, DateTime PeriodStart, long RowNumber }>>) }),
                Call(
                    typeof(Queryable).GetMethod("Where", new[] { typeof(IQueryable<Class>), typeof(Expression<Func<Class, bool>>) }),
                    Call(
                        typeof(EntityFrameworkQueryableExtensions).GetMethod("AsNoTracking"),
                        --
-- Not implemented - NodeType: Extension not implemented.
--
                    ),
                    Quote(
                        Lambda(
                            Equal(
                                MakeMemberAccess(change,
                                    typeof(ITrackedEntity).GetProperty("Id")
                                ),
                                key
                            ),
                            change
                        )
                    )
                ),
                Quote(
                    Lambda(
                        New(
                            typeof(<anonymous({ int Id, DateTime PeriodStart, long RowNumber })>).GetConstructor(new[] { typeof(int), typeof(DateTime), typeof(long) }),
                            MakeMemberAccess(change,
                                typeof(ITrackedEntity).GetProperty("Id")
                            ),
                            Call(
                                typeof(EF).GetMethod("Property"),
                                change,
                                Constant("PeriodStart")
                            ),
                            Call(
                                typeof(DbFunctionsExtensions).GetMethod("RowNumber"),
                                MakeMemberAccess(null,
                                    typeof(EF).GetProperty("Functions")
                                ),
                                Call(
                                    typeof(DbFunctionsExtensions).GetMethod("PartitionBy"),
                                    Call(
                                        typeof(DbFunctionsExtensions).GetMethod("OrderBy"),
                                        Call(
                                            typeof(DbFunctionsExtensions).GetMethod("Over"),
                                            MakeMemberAccess(null,
                                                typeof(EF).GetProperty("Functions")
                                            )
                                        ),
                                        Call(
                                            typeof(EF).GetMethod("Property"),
                                            change,
                                            Constant("PeriodStart")
                                        )
                                    ),
                                    MakeMemberAccess(change,
                                        typeof(ITrackedEntity).GetProperty("Id")
                                    )
                                )
                            )
                        ),
                        change
                    )
                )
            ),
            Quote(
                Lambda(
                    New(
                        typeof(<anonymous({ int Id, DateTime PeriodStart })>).GetConstructor(new[] { typeof(int), typeof(DateTime) }),
                        MakeMemberAccess(change0,
                            typeof(ITrackedEntity).GetProperty("Id")
                        ),
                        Call(
                            typeof(EF).GetMethod("Property"),
                            change0,
                            Constant("PeriodStart")
                        )
                    ),
                    change0
                )
            ),
            Quote(
                Lambda(
                    New(
                        typeof(<anonymous({ int Id, DateTime PeriodStart })>).GetConstructor(new[] { typeof(int), typeof(DateTime) }),
                        MakeMemberAccess(change1,
                            typeof(<anonymous({ int Id, DateTime PeriodStart, long RowNumber })>).GetProperty("Id")
                        ),
                        MakeMemberAccess(change1,
                            typeof(<anonymous({ int Id, DateTime PeriodStart, long RowNumber })>).GetProperty("PeriodStart")
                        )
                    ),
                    change1
                )
            ),
            Quote(
                Lambda(
                    New(
                        typeof(<anonymous({ Class change0, { int Id, DateTime PeriodStart, long RowNumber } change1 })>).GetConstructor(new[] { typeof(Class), typeof(<anonymous({ int Id, DateTime PeriodStart, long RowNumber })>) }),
                        change0, change1
                    ),
                    change0, change1
                )
            )
        ),
        Quote(
            Lambda(
                LessThanOrEqual(
                    MakeMemberAccess(
                        MakeMemberAccess(<>h__TransparentIdentifier0,
                            typeof(<anonymous({ Class change0, { int Id, DateTime PeriodStart, long RowNumber } change1 })>).GetProperty("change1")
                        ),
                        typeof(<anonymous({ int Id, DateTime PeriodStart, long RowNumber })>).GetProperty("RowNumber")
                    ),
                    Constant(2)
                ),
                <>h__TransparentIdentifier0
            )
        )
    ),
    Quote(
        Lambda(
            MakeMemberAccess(
                MakeMemberAccess(<>h__TransparentIdentifier0,
                    typeof(<anonymous({ Class change0, { int Id, DateTime PeriodStart, long RowNumber } change1 })>).GetProperty("change0")
                ),
                typeof(ITrackedEntity).GetProperty("Id")
            ),
            <>h__TransparentIdentifier0
        )
    )
)
virzak commented 1 year ago

Sorry, this is a bit hard to follow. Do you mind creating either a fiddle or a small project? Here is an example of a fiddle: https://dotnetfiddle.net/RtYk51

Ephron-WL commented 1 year ago

Yes, I understand. I have created an example using your fiddle. However, I'm having a hard time putting a query together that can't be optimized that matches my complex query above.

If we were supporting CTEs, I'd have created a CTE to obtain the row numbers using your window functions and then I would have selected the CTE result constraining it by the row number. Without a CTE, this can be performed using a join to a table constructed by a query. While simple in SQL, it is more complex in LINQ. We have to join the query that would be in a CTE to itself. Under the circumstances in my original post above you actually see a SQL join with a table created from a query. This was my intented illustration. That query-generating table (I don't know the technical term for this) should be calculating the RowNumber column and some other columns that can be used to perform the join, etc. Then, I should be able to constrain my final results using the RowNumber in the where clause. The query SELECT RowNumber =... FROM TABLE WHERE RowNumber <= 2; is not valid, because you can't constrain on a window function or its result in the same query, which I suspect you know. So, you have to layer your query with a CTE or table-generating query.

My simple example below is being optimized and the joined table-generating query is being simplified away. There were two problems with the example above: 1) the select clause of the inner table-generating query is missing the window function, i.e. it was not translated because it was in table-generated query that was part of a join and the translator didn't account for it and 2) the window function was placed into the where clause, which of course is not valid SQL. In the sample below, only #2 occurs because of optimizations. This is understandable because the translation engine is thinking that any expression representing a column value can be placed into the where clause, which is generally true, except for window functions. Of course, EF Core wasn't developed with window functions in mind.

I'd be happy to work with you on trying to apply this fix, time permitting. However, being more familiar with the library, if you can think of how to quickly apply a fix or improve the outcome, maybe you are the better person to do so.

var query2 = 
    (from testRow in context.TestRows
    join testRow2 in (
        from subRow in context.TestRows
        select new {
                subRow.Id,
                RowNumber = EF.Functions.RowNumber(EF.Functions.Over().OrderBy(subRow.Date).PartitionBy(subRow.Col1))
            }
    ) on testRow.Id equals testRow2.Id
    where testRow.Id > 1 && testRow2.RowNumber <= 2
    select testRow2.Id);

Console.WriteLine(query2.ToQueryString());

... produces...

SELECT [t0].[Id]
FROM
    [TestRows]            AS [t]
    INNER JOIN [TestRows] AS [t0]
        ON [t].[Id] = [t0].[Id]
WHERE
    [t].[Id] > 1
    AND ROW_NUMBER() OVER (PARTITION BY [t0].[Col1] ORDER BY [t0].[Date]) <= 2;

What we really need to see is something like this:

SELECT [t0].[Id]
FROM
    [TestRows]            AS [t]
    INNER JOIN (
        SELECT
            Id,
            RowNumber = ROW_NUMBER() OVER (PARTITION BY [t0].[Col1] ORDER BY [t0].[Date])
        FROM
            [TestRows]
    ) AS [t0]
        ON [t].[Id] = [t0].[Id]
WHERE
    [t].[Id] > 1
    AND RowNumber <= 2;
Ephron-WL commented 1 year ago

I think the translator has to see the window function as an expression it can't optimize away, like it normally would with any other expression. By selecting the entire joined table row it selects the window function, which while basically valid, is of no use in this context and really must be abstracted into another the inner table.

var query2 = 
    (from testRow in context.TestRows
    join testRow2 in (
        from subRow in context.TestRows
        select new {
                AnotherId = subRow.Id/2,
                Id = subRow.Id,
                RowNumber = EF.Functions.RowNumber(EF.Functions.Over().OrderBy(subRow.Date).PartitionBy(subRow.Col1))
            }
    ) on testRow.Id equals testRow2.Id
    where testRow.Id > 1 && testRow2.RowNumber <= 2
    select testRow2);

... producers...

SELECT [t0].[Id] / 2                                                     AS [AnotherId]
      ,[t0].[Id]
      ,ROW_NUMBER() OVER (PARTITION BY [t0].[Col1] ORDER BY [t0].[Date]) AS [RowNumber]
FROM
    [TestRows]            AS [t]
    INNER JOIN [TestRows] AS [t0]
        ON [t].[Id] = [t0].[Id]
WHERE
    [t].[Id] > 1
    AND ROW_NUMBER() OVER (PARTITION BY [t0].[Col1] ORDER BY [t0].[Date]) <= 2;
virzak commented 1 year ago

You can use AsSubQuery() from Thinktecture library. It works great, here is the fiddle: https://dotnetfiddle.net/dkZoV0

Steps are:

Add nuget package Thinktecture.EntityFrameworkCore.SqlServer Register with .AddCustomQueryableMethodTranslatingExpressionVisitorFactory() Wrap the use of Window Function with AsSubQuery()

The question is: should this library handle a case like this without forcing users to install another extension...

Ephron-WL commented 1 year ago

I'm going to see if I can implement subquery-with-window-function detection after a join (inner) and then construct the SQL accordingly. It is already taking me some time. I'll let you know how it goes.

Ephron-WL commented 1 year ago

Inner joins are created in the SelectExpression. It is not practical to try to supplement or replace the code in that class.

I do think Pawel's approach is probably best. I was able to integrate Pawel's AsSubQuery pattern into your library so that you will not have any dependencies on his library. Is that ok with you? Maybe you can give him credit in your readme. Also, I think you should add to the readme how to use the method since it allows users who do not already know how to use a single query to constrain their window function and that should be more performant than two calls.

I didn't create a proper unit test. The one I created was only for debugging. Some of the type and member names could be improved upon.

Ephron-WL commented 1 year ago

I synced my fork and see that I'm having difficulty with the SqlServer assembly. My thoughts follow:

image

Get rid of support for .net7.0, leave .net6.0, but upgrade the dependency to Microsoft.EntityFrameworkCore.SqlServer 7.0 and then upgrade your release to reflect the new version of the dependency. That way you are supporting .net6 and 7 target frameworks and the newer SqlServer library. Since you are upgrading your version to account for the breaking changes in EF Core Sql Server, you should be good. You should be able to do a minor version upgrade since no public APIs change, only internals/dependencies. I think. I'm not positive how versions are supposed to change when required dependencies change.

I think you can upgrade your EF Core and Relational dependencies to 7 as well. I have been using them .net6 without an issue.

You should only want to support .net7.0 when you are actually coding in C#11 or you want any SDK benefits of .net7. But, by doing so you may potentially require users to also upgrade their supported runtime, which may not be necessary at this time. The dependency of Microsoft.EntityFrameworkCore.SqlServer 6 does not correspond to .net 6 target framework and Microsoft.EntityFrameworkCore.SqlServer 7 does not correspond to .net 7 target framework. These target frameworks are not dependencies of each version of Microsoft.EntityFrameworkCore.SqlServer, but that is what is implied by your configuration. In fact, Microsoft.EntityFrameworkCore.SqlServer 6 and 7 are merely their own versions independent of the target framework. Microsoft.EntityFrameworkCore.SqlServer 7.0 is currently targeting .net6, so that is the target framework you should be supporting for best backward compatibility. Microsoft.EntityFrameworkCore.SqlServer is being upgraded with breaking changes and so your assembly should also be upgraded with this new dependency.

I couldn't get your projects to compile properly with your configuration, otherwise I wouldn't mention it here.

I think you have done an amazing job in developing this library. Before deciding upon your library I'd looked into linq2db.efcore and, like you, found their design problematic and, like you, I wanted an extension of EF Core. So, I'm grateful you created this library and that you are still servicing it.

Semantic Versioning 2 Dependency Change Version Handling: https://github.com/semver/semver/issues/148 https://softwareengineering.stackexchange.com/questions/440018/semantic-versioning-on-a-dev-dependencies-change https://stackoverflow.com/questions/53505965/semantic-versioning-and-dependency-changes

virzak commented 1 year ago

I'm not convinced .AsSubQuery() method should be copied here. The two reasons are:

Let me look into it a bit more and I agree with you, a use case like this should be in the README regardless of whether AsSubQuery used is copied or not.

Ephron-WL commented 1 year ago

I didn't reuse the namespace.

Ephron-WL commented 1 year ago

The truth is that your library is a good compliment, as is, to Thinktecture.EntityFrameworkCore and vice versa. So, if they are both dependencies of the same project then there would be no need to have the added feature.

virzak commented 1 year ago

I should have said "name collision" which makes it more relevant since global usings has been introduced.

john-bartu commented 1 year ago

Same problem here.

Thinktecture is not available for PostgresSql, so in my opinion this library should not be related to others. Is there any workaround for Npgsql?

Ephron-WL commented 1 year ago

I think I copied in the subqiuery logic from Thinktecture and submitted a PR, which was rejected by the library author because he considered it outside of the scope of the project. Take a look at my PR.

virzak commented 1 year ago

SubQuery() is not a good solution. I actually asked how a SubQuery should be implemented in Linq at .Net Data Community standup and was shutdown. Library / EF Core users shouldn't be aware any of that and it should be performed behind the scenes.

@john-bartu, been working on a solution to have that working automatically, almost got it, but got stuck and then distracted. Will try to finish this week. Sql Server users had a workaround, so it wasn't overly urgent until now. If you can't wait, use @Ephron-WL's fork.

Ephron-WL commented 1 year ago

If you found a way to implement the subquery without having to call a subquery function explicitly, then I salute you! 🏆🏆🏆

Ephron-WL commented 1 year ago

I might also add that I don't agree with developers shutting you down for suggesting the use of a subquery function. I mean, sure, we all want the system to do work for us without having to give it hints, e.g. there is implicit handling of the condition, but you and I both researched the possibilities and couldn't find any solution. The best was the use of the function, which I think is entirely legitimate and especially a minor implementation detail. There is the ideal, which many of us strive to obtain, and then there is what is practical and works now after an ideal was found not to be attainable. If you have figured out how to handle it implicitly in your library, then you are one major step ahead of my understanding and skillset in the matter and I commend that.

john-bartu commented 1 year ago

@virzak Thanks! Great I look forward to it.

In the meantime, I tried to write my own translation service to add one function EF.Functions.CountAll() just adding an additional column count(*) over(). Because this is the only function I need for now. But unfortunately, so far without success.

virzak commented 1 year ago

Almost there guys.

InspiringCode commented 6 months ago

Ist there any progress on this? In my opinion the AsSubQuery method is a pretty good and elegant solution in the case of window functions. Justification: When someone uses window functions they are already on the conceptual level of SQL specifics and window function return different values depending on where in my query/join I put them (if there are filters anywhere), so I want to be able to deliberately decide where (in which query/subquery) my window functions are placed.

I would really need the DENSE_RANK function in one of my projects and I need to be able to put it in a joined SQL subquery. Thinktecture doesn't support DENSE_RANK and this project doesn't support AsSubQuery and I really don't want to include both libraries... Is there any way out of this dilemma?

virzak commented 6 months ago

@InspiringCode , this weekend. Promise. If I won't get it this weekend, I'll make the PR public and ask for help. It is almost there, but doesn't work for some reason. There was a lot of work this holiday break on this project, but didn't get to this issue. Feel terrible for feeding people promises, but we're almost there.

virzak commented 6 months ago

Apologies to making you wait for a year, but this is now in a new beta.

Usage:

ctx.TestRows.Where(t => EF.Functions.DenseRank(EF.Functions.Over().OrderBy(t.Id / 10)) == 2)

There are currently 3 basic tests that are passing, but I'm sure more work will be needed.

This works for PostgreSQL, SQLite and SQL Server providers.

These tests aren't enough to cover my client project which was the driver for creation of this library, so I'm excited to add more tests very soon.

Feel free to submit PRs with failing basic tests of your own.

The approach for this was to rewrite expression:

From:

{[Microsoft.EntityFrameworkCore.Query.EntityQueryRootExpression].Where(t => (__Functions_0.Max(t.Id, __Over_1.PartitionBy((t.Id / 10))) == Convert(23, Nullable`1)))}

To:

{[Microsoft.EntityFrameworkCore.Query.EntityQueryRootExpression].Select(o => new MyAnon() {Original = o, P0 = __Functions_0.Max(o.Id, __Over_1.PartitionBy((o.Id / 10)))}).AsSubQuery().Where(w => (w.P0 == Convert(23, Nullable`1))).Select(b => b.Original)}

before query execution.

The starting point is here: https://github.com/zompinc/efcore-extensions/blob/ce835fbf3de012d6c92e510864ef1ab4942b7f77/src/Zomp.EFCore.WindowFunctions/Query/Internal/WindowFunctionsRelationalQueryTranslationPreprocessor.cs#L15-L17

The visitor:

If anyone has an opinion on the approach, I'd really appreciate. For example, it might be better to figure out if PushdownIntoSubquery should be invoked from SqlExpressions visitors.

cc: @roji

InspiringCode commented 6 months ago

Thank you for your efforts @virzak!

Please consider the following query (which is a very simplified version of my production query):

public class Employee {
    public Guid Id { get; set; }
    public string Name { get; set; }
    public string Department { get; set; }
    public decimal Salary { get; set; }
    public string Manager { get; set; }
}
var employeesWithRank = db
    .Set<Employee>()
    .Select(e => new {
        EmployeeId = e.Id,
        Rank = EF.Functions.RowNumber(e.Department, EF.Functions.OrderByDescending(e.Salary))
    });

var query = db.Set<Employee>()
    .Where(x => x.Manager == "John")
    .Join(employeesWithRank,
        e => e.Id,
        r => r.EmployeeId,
        (e, r) => new { Id = e.Id, Name = e.Name, SalaryRank = r.Rank });

It should return all employees whos manager is John with their salary rank (but their rank within their WHOLE department and NOT the rank within all subordinates of Jonn!).

So the expected query would be:

SELECT [e].[Id], [e].[Name], [t].[Rank] AS [SalaryRank]
FROM [Employee] AS [e]
INNER JOIN (
    SELECT [e0].[Id] AS [EmployeeId], DENSE_RANK() OVER(PARTITION BY [e0].[Department] ORDER BY [e0].[Salary] DESC) AS [Rank]
    FROM [Employee] AS [e0]
) AS [t] ON [e].[Id] = [t].[EmployeeId]
WHERE [e].[Manager] = N'John'

But the returned query is:

SELECT [e].[Id], [e].[Name], DENSE_RANK() OVER(PARTITION BY [e0].[Department] ORDER BY [e0].[Salary] DESC) AS [SalaryRank]
FROM [Employee] AS [e]
INNER JOIN [Employee] AS [e0] ON [e].[Id] = [e0].[Id]
WHERE [e].[Manager] = N'John'

Which is simply wrong. How can I solve this problem with your library?

roji commented 6 months ago

@virzak I'm interested in what you're doing here, but unfortunately I don't think I'll have time to look at the code in the near future. I will say that I'm currently working on various query architecture improvements that should hopefully make something like this much easier to do (making the SQL tree much more open to arbirtrary external manipulations, making it fully immutable...). CTE support is also high on my mental list, though this may not be done in time for 9.0 (we have a lot of planned work).

virzak commented 6 months ago

@InspiringCode,

Looks like that the work which was done for Where needs to be applied for Join as well. If a Window Function is detected inside a join - that's when PushdownIntoSubquery should be performed automatically for you.

The query you're currently getting can be available can be achieved using two methods (current 1 and current 2). One of them can be modified.

https://dotnetfiddle.net/SNiYpG

InspiringCode commented 6 months ago

@virzak But when I do it as you proposed in your dotnetfiddle, I would still have to add a reference to Thinktecture which I want to avoid. For this reason I proposed to make AsSubquery() public.

You would probably also have to perform PushdownIntoSubquery when a window function is applied within a subselect, and maybe there are quite a few other scenarios where the same applies:

var query = db.Set<Employee>()
    .Where(e => e.Manager == "John")
    .Select(e => new { 
         Id = e.Id, 
         Name = e.Name, 
         SalaryRank = db
             .Set<Employee>()
             .Select(r => EF.Functions.RowNumber(r.Department, EF.Functions.OrderByDescending(r.Salary)))
             .Single(r => r.Id == e.Id) })
virzak commented 6 months ago

@InspiringCode, I didn't save the fiddle, that's the source of confusion. Please look again. We'll make current 1 to produce exact SQL of desired.

virzak commented 6 months ago

@InspiringCode there you go:

https://dotnetfiddle.net/32T5Ux

No thinktecture and no .AsSubQuery(). Also desired Sql.

Still lots of tests to run, but this is the concept.

virzak commented 6 months ago

@InspiringCode, for your subselect could you provide a fiddle link?

virzak commented 5 months ago

@john-bartu @Ephron-WL really sorry for making you wait guys. The fix is in in the latest beta along with window functions inside a where clause as well as nested window functions. The AsSubQuery is automatic and isn't required for the cases in this issue.

Let me know if this works well for you.

john-bartu commented 5 months ago

@virzak Thank you for your time. I will have a look at it this week and report back to you with the results.

InspiringCode commented 5 months ago

@virzak Thank you for your efforts!

Regarding the subquery example: I thought about it in more detail and it is indeed hard to come up with a meaningful example. You would need to have double nested sub queries but in this case a Join is probably simpler and cleaner. So I guess we can forget about it for the moment.

virzak commented 5 months ago

@InspiringCode There is a similar issue with a subquery in the where clause, but it can also be solved with a join.

https://github.com/zompinc/efcore-extensions/blob/9efbb28a44d8e6f7263cc1c6d080ec8a6b683109/tests/Zomp.EFCore.WindowFunctions.Testing/SubQueryTests.cs#L159-L164

john-bartu commented 5 months ago

@virzak

Unfortunately for our production case, this still does not work.

I will try to prepare a test case, or at least a fiddle showing my usage in which this does not work.

In the meantime I'll just say that we are trying to use "EF.Functions.Count(EF.Functions.Over())" to determine the number of total rows before pagination but after filtering.

We have something like:

_context
     .Where(.....) // huge filtering
     .OrderByDescending(e => e.TimeModification)
     .Select( new EventRes { Id = id, Count = EF.Functions.Count(EF.Functions.Over()))
     .Skip(filter.Offset).Take(filter.ItemsPerPage)
     .Join(...) // many joins

The problem is, that the select containing count(*) over () is not considered as SubQuery.

The count(*) over() is putted to the top level of query, after skip and take, so it always return TAKE count.

More or less, our logic is supposed to be this:

We have groups, we filter them by members, order groups, paginate, and then take them with their members. Returning groups with joined members with information about the number of groups after filtering (but before pagination).

redwyre commented 5 months ago

I gave this a go with 8.0.19-beta but it didn't solve my case:

SELECT [e].[CattleMustered], [e].[TimeTaken], ROW_NUMBER() OVER(ORDER BY [e].[CattleMustered] DESC, [e].[TimeTaken]) AS [c], [e].[Id], ROW_NUMBER() OVER(ORDER BY [e].[CattleMustered] DESC, [e].[TimeTaken]) AS [c0]
          FROM [Entry1] AS [e]
          WHERE EXISTS (
              SELECT 1
              FROM [EntryPlayer] AS [e0]
              INNER JOIN [Players] AS [p] ON [e0].[SteamPlayerId] = [p].[Id]
              WHERE [e].[Id] = [e0].[EntryId] AND [p].[Name] = @__steamName_2)
          ORDER BY ROW_NUMBER() OVER(ORDER BY [e].[CattleMustered] DESC, [e].[TimeTaken])
          OFFSET @__p_3 ROWS FETCH NEXT @__p_4 ROWS ONLY

I have a where after the row_number but it's being hoisted inside (it's through a joiner table)

virzak commented 5 months ago

@redwyre,

This is slightly hard to follow.

Can I ask to provide a fiddle with expected SQL? Will try to fix this week.

Thanks

tsanton commented 2 months ago

Hi @virzak !:) Any progression being made on this? I'm using the RowNumber for a "nasty" query of mine that breaks in main but works in pre-release. Hoping to see this make the main branch :)

If required I'm happy to show how it works for me, but I can't speak for how it's supposedly not working ref. @john-bartu, & @redwyre .

/T

virzak commented 2 months ago

This change is already in the master branch. Since it works for you, there should be no problem. When everyone else is satisfied and the test cases improve, I'll drop the prerelease tag.

sankarkumar23 commented 2 weeks ago

This is an excellent library. I've been seeking window functions in EF for the past 10 years, and finally, I've found a promising initiative here.

WHERE clause on top of window function is working good now on this package

Can we introduce the FIRST_VALUE and LAST_VALUE functions?

I'll also try contribute if I get any free time. This is awesome work. Thanks a bunch for all the contributors.

sankarkumar23 commented 2 weeks ago

If I have multiple window functions in the query, it is considering only the first one

var queryWithRanks = query.Select(i => new
{
    i,
    DenseRank = EF.Functions.DenseRank(
                    EF.Functions.Over().PartitionBy(i.TicketNumber)
                        .OrderByDescending(i.PublishDate)),
    Rank = EF.Functions.Rank(
                    EF.Functions.Over().PartitionBy(i.TicketNumber)
                        .OrderByDescending(i.PublishDate)),
}).Where(x => x.DenseRank == 1).Select(x => x.i);

Rank is completely ignored from the query

Another potential issue Although DenseRank column is pushed inside to the inner query.. the joins are being applied to the outer query. What if I want to use joined table column inside the dense rank partition? This is becoming so complex it appears..

sankarkumar23 commented 2 weeks ago

This change is already in the master branch. Since it works for you, there should be no problem. When everyone else is satisfied and the test cases improve, I'll drop the prerelease tag.

@virzak - This approach isn't feasible. How difficult would it be to retain the user's query as-is as an inner query and simply add a new outer query to access the window function columns defined in the inner query for filtering purposes?