PawelGerr / Thinktecture.EntityFrameworkCore

These libraries extend Entity Framework Core by a few features to make it easier to work with EF and for easier integration testing or to get more performance in some special cases.
https://dev.azure.com/pawelgerr/Thinktecture.EntityFrameworkCore
BSD 3-Clause "New" or "Revised" License
61 stars 17 forks source link

Using RowNumber with Join produces wrong query with wrong results #48

Closed InspiringCode closed 7 months ago

InspiringCode commented 9 months ago

Given the following IQueryable:

var hierarchies = db.Set<Bom>()
    .Where(x => x.Project == "Project 1")
    .Select(h => new
    {
        h.BomId,
        Hierarchy = EF.Functions.RowNumber(
            h.ParentBomId, // PARTITION BY
            EF.Functions   // ORDER BY
                .OrderBy(h.SortCriterion == null ? 1 : 0)
                .ThenBy(h.SortCriterion)
                .ThenBy(h.BomId))
    });

When including the above query in a Linq Join operation:

db.Set<Bom>()
    .Join(hierarchies,
        x => x.BomId,
        h => h.BomId,
        (x, h) => new { BomId = x, Hierarchy = h!.Hierarchy })
    .Select(x => new { x.Bom, x.Hierarchy });

Then the following query SHOULD be generated:

SELECT [b].[BOM_ID] AS [BomId], [t].[Hierarchy]
FROM [eng].[BOM] AS [b]
INNER JOIN (
    SELECT 
      [b0].[BOM_ID], 
      ROW_NUMBER() OVER(PARTITION BY [b0].[Parent_BOM_ID] ORDER BY CASE
          WHEN [b0].[SortCriterion] IS NULL THEN 1
          ELSE 0
      END, [b0].[SortCriterion], [b0].[BOM_ID]) AS [Hierarchy]
    FROM [eng].[BOM] AS [b0]
    WHERE [b0].[PROJECT] = @__key_Project_0
) AS [t] ON [b].[BOM_ID] = [t].[BOM_ID]
WHERE [b].[BOM_ID] = CAST(1234 AS bigint)

But the following WRONG query is actually generated:

SELECT [b].[BOM_ID] AS [BomId], ROW_NUMBER() OVER(PARTITION BY [t].[Parent_BOM_ID] ORDER BY CASE
    WHEN [t].[SortCriterion] IS NULL THEN 1
    ELSE 0
END, [t].[SortCriterion], [t].[BOM_ID]) AS [Hierarchy]
FROM [eng].[BOM] AS [b]
INNER JOIN (
    SELECT [b0].[BOM_ID], [b0].[Parent_BOM_ID], [b0].[SortCriterion]
    FROM [eng].[BOM] AS [b0]
    WHERE [b0].[PROJECT] = @__key_Project_0
) AS [t] ON [b].[BOM_ID] = [t].[BOM_ID]
WHERE [b].[BOM_ID] = CAST(1234 AS bigint)

This is clearly wrong, because the expected query returns the ROW_NUMBER which the row with ID 1234 has within all the rows that have the same Project as the row with ID 1234. The second query however always returns 1 (remember: the window function is always applied to the result set of the FROM clause)!

Could you please fix this bug?

Is there any way to workaround this issue? I tried to put it as a corelated sub-select but couldn't find a way to force EF to generate an additional wrapping SELECT query (the BOM_ID filter was merged together with the Project filter resulting in a constant ROW_NUMBER of 1).

PawelGerr commented 9 months ago

Please try using the extension metod AsSubQuery() to make the query a sub-select (example).

This is a known issue and can only be fixed by rewriting the (whole) expression tree. I decided against rewriting the query and implemented the method AsSubQuery().