Support projecting different navigations (but same type) via conditional operator

dzyranov commented 2 months ago

File a bug

using System;
using System.Data;
using System.Linq;
using Microsoft.EntityFrameworkCore;

using var db = new BloggingContext();

db.Database.EnsureDeleted();
db.Database.EnsureCreated();

var ss = db.Blogs
    .Where(p => p.Parent != null)
    .Where(x => (x.Id == 0 ? x.Parent : x.Parent).Id != null)
    .ToList();

Console.WriteLine(ss.Count);

public class BloggingContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options
            .LogTo(Console.WriteLine, Microsoft.Extensions.Logging.LogLevel.Information)
            .UseSqlite($"Data Source=test.db");

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Blog>().HasData(new Blog { Id = 1, Url = "http://efcore/" });
        modelBuilder.Entity<Blog>().HasData(new Blog { Id = 2,  ParentId = 1, Url = "trigger ELSE -> NULL" });
        modelBuilder.Entity<Blog>().HasData(new Blog { Id = 3,  ParentId = 1, Url = "2" });
        modelBuilder.Entity<Blog>().HasData(new Blog { Id = 4,  ParentId = 1, Url = "3" });
    }
}

public class Blog
{
    public int Id { get; set; }
    public required string Url { get; set; }
    public int? ParentId { get; set; }
    public virtual Blog Parent { get; set; }
}

Include stack traces

Unhandled exception. System.InvalidOperationException: The LINQ expression 'DbSet<Blog>()
    .LeftJoin(
        inner: DbSet<Blog>(), 
        outerKeySelector: b => EF.Property<int?>(b, "ParentId"), 
        innerKeySelector: b0 => EF.Property<int?>(b0, "Id"), 
        resultSelector: (o, i) => new TransparentIdentifier<Blog, Blog>(
            Outer = o, 
            Inner = i
        ))
    .Where(b => b.Inner != null)
    .Where(b => (int?)b.Outer.Id == 0 ? b.Inner : b.Inner.Id != null)' could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', or 'ToListAsync'. See https://go.microsoft.com/fwlink/?linkid=2101038 for more information.
   at Microsoft.EntityFrameworkCore.Query.QueryableMethodTranslatingExpressionVisitor.Translate(Expression expression)
   at Microsoft.EntityFrameworkCore.Query.RelationalQueryableMethodTranslatingExpressionVisitor.Translate(Expression expression)
   at Microsoft.EntityFrameworkCore.Query.QueryCompilationContext.CreateQueryExecutor[TResult](Expression query)
   at Microsoft.EntityFrameworkCore.Storage.Database.CompileQuery[TResult](Expression query, Boolean async)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.CompileQueryCore[TResult](IDatabase database, Expression query, IModel model, Boolean async)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.<>c__DisplayClass9_0`1.<Execute>b__0()
   at Microsoft.EntityFrameworkCore.Query.Internal.CompiledQueryCache.GetOrAddQuery[TResult](Object cacheKey, Func`1 compiler)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.Execute[TResult](Expression query)
   at Microsoft.EntityFrameworkCore.Query.Internal.EntityQueryProvider.Execute[TResult](Expression expression)
   at Microsoft.EntityFrameworkCore.Query.Internal.EntityQueryable`1.GetEnumerator()
   at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
   at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
   at Program.<Main>$(String[] args)

Include provider and version information

EF Core version: 8.08 Database provider: Microsoft.EntityFrameworkCore.SqlLite Target framework: .NET 8.0 Operating system: Visual Studio 2022 17.4

dzyranov commented 2 months ago

Will it be fix ?

cincuranet commented 2 months ago

Possibly related #34589.

dzyranov commented 2 months ago

This working in ef6.. My project depends heavily on this

ranma42 commented 1 month ago

Possibly related #34589.

@cincuranet #34589 is this issue 🤔

cincuranet commented 1 month ago

You are right. So it's not possibly related, but surely related. ;)

But it doesn't matter. During a triage with @roji last week we concluded it (whatever it was) was not related.

ranma42 commented 1 month ago

IIUC the issue is that RelationalSqlTranslatingExpressionVisitor.VisitConditional requires that all of its sub-expressions are translated to SqlExpressions by Visit.

This is (hopefully) going to be true for the condition (as it should always be a boolean value), but the true/false expressions might transport much more complex types (they might be navigations to other objects/collections/new{} objects, ...).

I am unsure if this is 100% general, but distributing some of the operations applied to the conditional might help in many cases. In this case the transformation would be: (x.Id == 0 ? x.Parent : x.Parent).Id -> (x.Id == 0 ? x.Parent.Id : x.Parent.Id)

roji commented 1 month ago

Yeah, good points @ranma42.

First, C# itself constrains the result of the conditional operation to be the same .NET type (e.g. you can't have x ? 8 : "foo"). I think this means that there are two possible useful usages:

The exact same type is projected, but e.g. from different columns (x ? b.Id1 : b.Id2, where both Ids have exactly the same type). This has variants of scalars (with the Ids) and structural types (e.g. x ? b.ShippingAddress : b.BillingAddress).
We're projecting two different structural types which are in a hierarchy. Here, the actual type getting projected is the common superclass.

We should probably be concentrating on the 1st case here, possibly splitting the 2nd case out to another issue if needed.

@ranma42

I am unsure if this is 100% general, but distributing some of the operations applied to the conditional might help in many cases.

Yeah, that's possible... Though for the general case we still must support the basic translation here; for example, the results of the conditional expression may be projected out without anything being composed on top. And once the general case is done, I'm not sure lowering the member access into the conditional is good (the SQL becomes more complex with duplication...).

@dzyranov given this worked on EF6, could you please post the SQL you got there? Also, assuming your actual query above is a simplification (x.Id == 0 ? x.Parent : x.Parent - not useful), can you provide a bit more context on how you're using the conditional in your real application? Are different types of a hierarchy being projected, or something else?

ranma42 commented 1 month ago

Yeah, good points @ranma42.

First, C# itself constrains the result of the conditional operation to be the same .NET type (e.g. you can't have x ? 8 : "foo"). I think this means that there are two possible useful usages:

The exact same type is projected, but e.g. from different columns (x ? b.Id1 : b.Id2, where both Ids have exactly the same type). This has variants of scalars (with the Ids) and structural types (e.g. x ? b.ShippingAddress : b.BillingAddress).

We're projecting two different structural types which are in a hierarchy. Here, the actual type getting projected is the common superclass.

We should probably be concentrating on the 1st case here, possibly splitting the 2nd case out to another issue if needed.

Yes, the 2nd case would definitely be a further step.

I am unsure if this is 100% general, but distributing some of the operations applied to the conditional might help in many cases.

Yeah, that's possible... Though for the general case we still must support the basic translation here; for example, the results of the conditional expression may be projected out without anything being composed on top.

I believe that this issue should essentially affect collection operations such as .Where, .Take, and so on, which basically expect an expression that computes a scalar (boolean/number). Is there any other case you think would fail? (IIUC projecting the complex type would work just fine with some further computation and/or object materialization performed on the client side).

And once the general case is done, I'm not sure lowering the member access into the conditional is good (the SQL becomes more complex with duplication...).

I think that the opportunities for improving the SQL are unrelated to this. A slightly modified version of the original snippet (changed so that the conditional matches your example of using 2 different values in the branches):

// @nuget: Microsoft.EntityFrameworkCore.Sqlite -Version 8.0.8

using System;
using System.Data;
using System.Linq;
using Microsoft.EntityFrameworkCore;

using var db = new BloggingContext();

db.Database.EnsureDeleted();
db.Database.EnsureCreated();

db.Blogs
    .Where(x => (x.Id == 0 ? x.LeftChild.Id : x.RightChild.Id) != null)
    .ToList();

db.Blogs
    .Where(x => (x.Id == 0 ? x.LeftChild.Url : x.RightChild.Url) != null)
    .ToList();

db.Blogs
    .Where(x => (x.Id == 0 ? x.LeftChild : x.RightChild).Url != null)
    .ToList();

public class BloggingContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options
            .LogTo(Console.WriteLine, Microsoft.Extensions.Logging.LogLevel.Information)
            .UseSqlite($"Data Source=test.db");

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Blog>().HasOne(x => x.LeftChild);
        modelBuilder.Entity<Blog>().HasOne(x => x.RightChild);
    }
}

public class Blog
{
    public int Id { get; set; }
    public string Url { get; set; }
    public virtual Blog? LeftChild { get; set; }
    public virtual Blog? RightChild { get; set; }
}

This produces a first query that is:

      SELECT "b"."Id", "b"."LeftChildId", "b"."RightChildId", "b"."Url"
      FROM "Blogs" AS "b"
      LEFT JOIN "Blogs" AS "b0" ON "b"."LeftChildId" = "b0"."Id"
      LEFT JOIN "Blogs" AS "b1" ON "b"."RightChildId" = "b1"."Id"
      WHERE CASE
          WHEN "b"."Id" = 0 THEN "b0"."Id"
          ELSE "b1"."Id"
      END = 42

I agree that is inefficient because of the JOINs; ideally this should be

      SELECT "b"."Id", "b"."LeftChildId", "b"."RightChildId", "b"."Url"
      FROM "Blogs" AS "b"
      WHERE CASE
          WHEN "b"."Id" = 0 THEN "b"."LeftChildId"
          ELSE "b"."RightChildId"
      END = 42

but AFAICT this is unrelated to the conditional (EFCore does not simplify FK accesses; this seems to be #28077 and #33067).

The more relevant query is:

      SELECT "b"."Id", "b"."LeftChildId", "b"."RightChildId", "b"."Url"
      FROM "Blogs" AS "b"
      LEFT JOIN "Blogs" AS "b0" ON "b"."LeftChildId" = "b0"."Id"
      LEFT JOIN "Blogs" AS "b1" ON "b"."RightChildId" = "b1"."Id"
      WHERE CASE
          WHEN "b"."Id" = 0 THEN "b0"."Url"
          ELSE "b1"."Url"
      END = 'foo'

which looks mostly fine to me. An alternative query for this would be

      SELECT "b"."Id", "b"."LeftChildId", "b"."RightChildId", "b"."Url"
      FROM "Blogs" AS "b"
      WHERE CASE
          WHEN "b"."Id" = 0 THEN (SELECT "b0"."Url" FROM "Blogs" AS "b0" WHERE "b"."LeftChildId" = "b0"."Id")
          ELSE (SELECT "b1"."Url" FROM "Blogs" AS "b1" ON "b"."RightChildId" = "b1"."Id")
      END = 'foo'

but I believe this would actually be worse. I would definitely like to know what EF6 did in this case (and if @roji knows some trick to simplify the .Url query).

dzyranov commented 1 month ago

Use cases

ef 6

public WarehouseInvoice
{
  ...
  [Computed]
[NotMapped]
public virtual WarehouseItem Warehouse
{
    get
    {
        var warehouse = (WarehouseItem)null;

        if (this.IsIncome)
        {
            warehouse = this.ReceiverWarehouse;
        }
        else if (this.IsExpense)
        {
            warehouse = this.SenderWarehouse;
        }

        return warehouse;
    }
}
  ...
}

ef core 8

public WarehouseInvoice
{
  ...
  [Projectable]
  public WarehouseItem? Warehouse => this.IsIncome ? this.ReceiverWarehouse : (this.IsExpense ? this.SenderWarehouse : null);
  ...
}

Use

dataContext
  // 'Warehouse' computed field
  .Where(p => p.Invoice.Warehouse.Agent_id == agentId)

Sorry for my English

roji commented 1 month ago

@dzyranov we need the SQL.

roji commented 1 month ago

@ranma42 agree with everything you wrote...

but I believe this would actually be worse.

I agree... In general JOIN syntax is preferable than scalar subqueries - database planners are usually able to do more with them...

I think the ideal translation for your last query would be:

SELECT "b"."Id", "b"."LeftChildId", "b"."RightChildId", "b"."Url"
FROM "Blogs" AS "b"
LEFT JOIN "Blogs" AS "b0" ON CASE WHEN "b"."Id" = 0 THEN "b"."LeftChildId" = "b0"."Id" ELSE "b"."RightChildId" = "b0"."Id" END
-- LEFT JOIN "Blogs" AS "b0" ON "b"."LeftChildId" = "b0"."Id"
-- LEFT JOIN "Blogs" AS "b1" ON "b"."RightChildId" = "b1"."Id"
WHERE [b0].[Url] = 'foo'

That is, recognize that the join itself is conditional (but against the same table). Note https://github.com/dotnet/efcore/issues/33745#issuecomment-2356735655, where we're just now discussing allowing arbitrary predicates in join conditions. While that's easy enough to do, actuallyrecognizing the pattern and transforming it appropriately is likely to be quite non-trivial.

dzyranov commented 1 month ago

@dzyranov we need the SQL.

var q = dataContext
                .WarehouseInvoice
                .Select(p => (p.Id % 2 == 0 ? p.ReceiverWarehouse : p.SenderWarehouse).Agent_id)
                .Take(10);

SELECT "Alias1"."C1" FROM (SELECT CASE WHEN (0 = "Extent1"."Id" % 2) THEN ("Extent2"."Agent_id") ELSE ("Extent3"."Agent_id") END AS "C1" FROM "public"."WarehouseInvoice" AS "Extent1" INNER JOIN "public"."WarehouseItem" AS "Extent2" ON "Extent1"."ReceiverWarehouse_id" = "Extent2"."Id" INNER JOIN "public"."WarehouseItem" AS "Extent3" ON "Extent1"."SenderWarehouse_id" = "Extent3"."Id" LIMIT 10) AS "Alias1"

roji commented 1 month ago

@dzyranov thanks, this corresponds to what @ranma42 and I were discussing above, i.e. simply add a join for each leg of the conditional operator.

dotnet / efcore