dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.68k stars 3.17k forks source link

EF Core 8 Query Translation Bug: Array operations with union #33258

Open elad-legit opened 6 months ago

elad-legit commented 6 months ago

Description

After upgrading to EF8 from EF6, one of our complex queries including array operations stopped working ( everything worked fine with EF6 ). After debugging, we narrowed the problem to array operation after using union or concat. ( writing the same query without using concat worked fine )

Entity Framework Query


public enum SomeEnum
{
    E1,
    E2,
}

public class Obj1
{
    public string Id { get; set; }
    public SomeEnum[] Types { get; set; }
}

public class Obj2
{
    public string Id { get; set; }
    public SomeEnum[] Types { get; set; }
}

var queryable = context.objs1.Select(_ => new { _.Id, _.Types })
     .Concat(context.objs2.Select(_ => new { _.Id, _.Types }))
    .Where(_ => _.Types.Contains(SomeEnum.E1))
    .ToList();

Stack traces

.Contains(__p_2))' could not be translated. Additional information: Translation of method 'System.Linq.Enumerable.Contains' failed. If this method can be mapped to your custom function, see https://go.microsoft.com/fwlink/?linkid=2132413 for more information. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', or 'ToListAsync'. See https://go.microsoft.com/fwlink/?linkid=2101038 for more information.

EF Core version: 8.0.2 Database provider: Npgsql 8.0.2 Target framework: .NET 8.0

yair-legit commented 6 months ago

This also happens if we try using Union instead of Concat

roji commented 6 months ago

tl;dr confirmed, regression on EFCore.PG (the scenario wasn't working on other providers, since primitive collections weren't supported before 8.0). As a workaround, move the Where into the two set operation legs - while it doesn't look as good, it produces the same results and should have the same performance:

_ = context.objs1.Where(o => o.Ints.Contains(8)).Select(o => o.Ints)
    .Concat(context.objs2.Where(o => o.Ints.Contains(8)).Select(o => o.Ints))
    .ToList();

Bug analysis

For the query:

_ = context.objs1.Select(o => o.Ints)
    .Concat(context.objs2.Select(o => o.Ints))
    .Where(o => o.Contains(8))
    .ToList();

If we remove the Concat, the query works; the tree that comes out of preprocessing is:

[Microsoft.EntityFrameworkCore.Query.EntityQueryRootExpression].Where(o => Property(o, "Ints").AsQueryable().Contains(8)).Select(o => Property(o, "Ints"))

If we put the Concat, we get the following tree:

[Microsoft.EntityFrameworkCore.Query.EntityQueryRootExpression].Select(o => Property(o, "Ints")).Concat([Microsoft.EntityFrameworkCore.Query.EntityQueryRootExpression].Select(o0 => Property(o0, "Ints"))).Where(e => e.Contains(8))

In other words, if the Concat isn't there, the Select is moved forward (pending selector), and the Where is processed correctly - we end up with a queryable Contains. When the Concat is there, no such movement occurs, and the Contains remains an enumerable Contains (instead of queryable, this is why we fail the translation).

I'm not sure why we end up with an enumerable Contains when there's a set operation - @maumar any clue?

Minimal runnable repro ```c# await using var context = new BlogContext(); await context.Database.EnsureDeletedAsync(); await context.Database.EnsureCreatedAsync(); var queryable = context.objs1.Select(o => o.Ints) .Concat(context.objs2.Select(o => o.Ints)) .Where(o => o.Contains(8)) .ToList(); public class BlogContext : DbContext { public DbSet objs1 { get; set; } public DbSet objs2 { get; set; } protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder) => optionsBuilder .UseNpgsql("Host=localhost;Username=test;Password=test") .LogTo(Console.WriteLine, LogLevel.Information) .EnableSensitiveDataLogging(); } public class Obj1 { public string Id { get; set; } public int[] Ints { get; set; } } public class Obj2 { public string Id { get; set; } public int[] Ints { get; set; } } ```
roji commented 6 months ago

@elad-legit as a side comment - we use upvotes to gauge how many of our users actually run into an issue, and having lots of people around you voting for it creates a false image of that (it seems that all 20 votes above come from your company). Can you please refrain from doing that?

elad-legit commented 6 months ago

@elad-legit as a side comment - we use upvotes to gauge how many of our users actually run into an issue, and having lots of people around you voting for it creates a false image of that (it seems that all 20 votes above come from your company). Can you please refrain from doing that?

Yes, sorry about that.

Unfortunately, our actual use case is much more complicated, and we can't easily move the where before the Concat. Thanks for the very quick response!

yair-legit commented 6 months ago

@maumar Is there any timeline we should expect regarding this issue?

maumar commented 5 months ago

@elad-legit @yair-legit Sorry for late replay. We are absolutely swamped with investigations regular dev work and other stuff, so response times are way longer than usual. As a workaround you can apply Where operation inside Concat for both sources separately:

        var queryable = context.objs1.Select(_ => new { _.Id, _.Types }).Where(_ => _.Types.Contains(SomeEnum.E1))
            .Concat(context.objs2.Select(_ => new { _.Id, _.Types }).Where(_ => _.Types.Contains(SomeEnum.E1)))
            //.Where(_ => _.Types.Contains(SomeEnum.E1))
            .ToList();

this produces the following sql:

SELECT [o].[Id], [o].[Types]
FROM [objs1] AS [o]
WHERE 0 IN (
    SELECT [t].[value]
    FROM OPENJSON([o].[Types]) WITH ([value] int '$') AS [t]
)
UNION ALL
SELECT [o0].[Id], [o0].[Types]
FROM [objs2] AS [o0]
WHERE 0 IN (
    SELECT [t0].[value]
    FROM OPENJSON([o0].[Types]) WITH ([value] int '$') AS [t0]
)

which is not ideal, but should unblock the scenarios.

maumar commented 5 months ago

Problem is in nav expansion. For the case without Concat, when processing source of Where method, we break that part down to QueryRootExpression (objs) and pending selector of member access to Ints primitive collection. Then, during ExpandNavigationsForSource step, we process the pending selector, recognize that it's a primitive collection and produce PrimitiveCollectionReference. Then, during ExpandNavigationsForSource visitation we have code to process primitive collection, that appends AsQueryable call and encases everything in NavigationExpansionExpression

In the Concat case, nav expansion is not smart enough to extract pending selector (we couldn't do it even if we knew how - can't Concat two different sources, so the projections need to stay inside concat). We end up with source being QueryRoot.Select(x => x.Ints).Concat(...) and the pending selector is just identity projection. Then, ExpandingExpressionVisitor doesn't create PrimitiveCollectionReference, and the visitation thinks we it's dealing with regular NavigationTreeExpression. The AsQueryable() that we put there earlier, during NormalizeQueryableMethod gets gobbled up (as nav expansion thinks it's not needed) and we end up with the issue.

Ultimately it's a pending selector issue (https://github.com/dotnet/efcore/issues/32957)

AlmightyLks commented 5 months ago

I have stumbled upon the same bug myself (I think?), and luckily found this issue before filing my own 😄

As i understood, it is an EFCore Problem, not a DB-Adapter Problem? Like Postgres, SQL Server, SQLite, etc. as I was able to reproduce it in the latter two

My own MRE can be found at here

The Bug is especially problematic, because our Where condition is a Contains() with a list parameter consisting of a few entries.. Which we would like to not repeat up to 6 times within our IQueryable, as it generates a query that is too complex for SQL Server 😅

roji commented 5 months ago

@AlmightyLks your link leads to a 404, is it maybe a private repo that needs to be made public?

AlmightyLks commented 5 months ago

@roji Ah yes, sorry. Can you try again?

roji commented 5 months ago

@AlmightyLks yes, from what I can tell your case is identical to what has already been reported above - I'm minimizing these comments as I don't think there's new information here.