dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.68k stars 3.17k forks source link

Returning `SqlFragmentExpression` from `HasTranslation` doesn't work (switches to client evaluation) #32969

Open InspiringCode opened 8 months ago

InspiringCode commented 8 months ago

I have the following function mapping with HasTranslation:

public static partial class EntityFrameworkExtensions
{
    public static int TotalRowCount()
        => throw new NotSupportedException();

    public static void HasTotalRowCountFunction(this ModelBuilder mb)
    {
        mb
            .HasDbFunction(() => TotalRowCount())
            .HasTranslation(arguments =>
            {
                return new SqlFragmentExpression("COUNT(*) OVER()");
            });
    }
}

But when I use the TotalRowCount function in a projection, the SqlFragmentExpression is ignored and the TotalRowCount function is evaluated client-side (the NotSupportedException is thrown). When I return a SqlFunctionExpression instead, the correct SQL is generated and the SQL function is correctly called.

I am observing a similar behavior when using an `IMethodCallTranslator´ to do the same translation. But in this case another strange behavior can be observed: If my function has a parameter, the translator is called, if it has no parameters (provided by LINQ query), like the method above, the translator isn't even called.

I did look into the EF source code quite a bit but I am feeling kind of lost. Could you please give me a tip why this doesn't work, how I could possibly achieve this simple mapping or at least some hints where I could look in the source code.

I really appreciate any help. Thank you.

InspiringCode commented 8 months ago

I have found a way to achieve the desired result but it is VERY hacky:

public static void HasTotalRowCountFunction(this ModelBuilder mb)
{
    mb
        .HasDbFunction(() => TotalRowCount())
        .HasTranslation(arguments =>
        {
            return new SqlFunctionExpression(
                "COUNT",
                new SqlExpression[] { new SqlFragmentExpression("*) OVER(") },
                false,
                new[] { false },
                typeof(int),
                null);
        });
}

But I am afraid that evil ghosts will haunt me in my nights dreams for the rest of my life if I put this into production... So I would really prefer a cleaner solution for this.

roji commented 8 months ago

Any chance you can post a minimal console program that shows this happening?

alienwareone commented 8 months ago

Was trying to do the same by returing an SqlFragmentExpression directly but it will throw an InvalidOperationException: _'The LINQ expression '[expression]' could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', or 'ToListAsync'.'_

Returning a derived SqlExpression which then returns the SqlFragmentExpression works, but unfortunately not with subqueries.

Reproduction code that tries to workaround the NULLS LAST issue in PostgreSQL in a very hacky way 😉

#define PG_DESCENDING_WITH_NULLS_LAST_EXPRESSION

using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Query;
using Microsoft.EntityFrameworkCore.Query.SqlExpressions;
using Microsoft.EntityFrameworkCore.Storage;
using Microsoft.Extensions.Logging;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

await using var dbContext = new ApplicationDbContext();
await dbContext.Database.EnsureDeletedAsync();
await dbContext.Database.EnsureCreatedAsync();

var sql1 = dbContext.Products
    .OrderBy(x => x.Discount.DescendingWithNullsLast())
    .ToQueryString();

Console.WriteLine(sql1);
// 👍
// SELECT p."Id", p."Discount", p."Name", p."Price"
// FROM "Products" AS p
// ORDER BY p."Discount" DESC NULLS LAST

var sql2 = dbContext.Products
    .Include(x => x.ProductAttributes)
    .OrderBy(x => x.Discount.DescendingWithNullsLast())
    .Skip(10)
    .Take(10)
    .ToQueryString();

Console.WriteLine(sql2);
// SELECT t."Id", t."Discount", t."Name", t."Price", p0."Id", p0."Name", p0."ProductId"
// FROM (                                                                     👇👇👇
//     SELECT p."Id", p."Discount", p."Name", p."Price", p."Discount" DESC NULLS LAST AS c
//     FROM "Products" AS p
//     ORDER BY p."Discount" DESC NULLS LAST
//     LIMIT @__p_0 OFFSET @__p_0
// ) AS t
// LEFT JOIN "ProductAttribute" AS p0 ON t."Id" = p0."ProductId"
// ORDER BY t.c, t."Id"

public sealed class ApplicationDbContext : DbContext
{
    public DbSet<Product> Products => Set<Product>();

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder) => optionsBuilder
        .UseNpgsql("Host=localhost;Username=postgres;Password=postgres;Database=EfCorePostgresqlNullsLast")
        .LogTo(Console.WriteLine, LogLevel.Debug)
        .EnableSensitiveDataLogging()
        .EnableDetailedErrors();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
#if PG_DESCENDING_WITH_NULLS_LAST_EXPRESSION
        modelBuilder
            .HasDbFunction(typeof(OrderByExtensions).GetMethod(nameof(OrderByExtensions.DescendingWithNullsLast), [typeof(int?)])!)
            .HasTranslation(e =>
            {
                var column = (ColumnExpression)e[0];
                return new PgDescendingWithNullsLastExpression(column, e[0].Type, e[0].TypeMapping);
            });
#else
        // System.InvalidOperationException: The LINQ expression 'DbSet<Product>().OrderBy(p => p.Discount.DescendingWithNullsLast())' could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', or 'ToListAsync'. See https://go.microsoft.com/fwlink/?linkid=2101038 for more information.
        modelBuilder
            .HasDbFunction(typeof(OrderByExtensions).GetMethod(nameof(OrderByExtensions.DescendingWithNullsLast), [typeof(int?)])!)
            .HasTranslation(e =>
            {
                var column = (ColumnExpression)e[0];
                return new SqlFragmentExpression($"""{column.TableAlias}."{column.Name}" DESC NULLS LAST""");
            });
#endif
    }
}

public sealed class Product
{
    public int Id { get; private set; }
    public required string Name { get; set; }
    public required int Price { get; set; }
    public required int? Discount { get; set; }
    public ICollection<ProductAttribute> ProductAttributes { get; private set; } = [];
}

public sealed class ProductAttribute
{
    public int Id { get; private set; }
    public required string Name { get; set; }
}

public sealed class PgDescendingWithNullsLastExpression(
    ColumnExpression column,
    Type type,
    RelationalTypeMapping? typeMapping)
    : SqlExpression(type, typeMapping)
{
    protected override void Print(ExpressionPrinter expressionPrinter)
    {
    }

    protected override Expression VisitChildren(ExpressionVisitor visitor)
    {
        return new SqlFragmentExpression($"""{column.TableAlias}."{column.Name}" DESC NULLS LAST""");
    }
}

public static class OrderByExtensions
{
    // Only works inside OrderBy() and NOT OrderByDescending()
    public static int? DescendingWithNullsLast(this int? _)
    {
        throw new NotImplementedException();
    }
}

.csproj

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <LangVersion>preview</LangVersion>
    <Nullable>enable</Nullable>
    <WarningsAsErrors>nullable</WarningsAsErrors>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Npgsql.EntityFrameworkCore.PostgreSQL" Version="8.0.0" />
  </ItemGroup>

</Project>
InspiringCode commented 7 months ago

@roji Here is a .NET Fiddle that shows the failing version and the dirty workaround hack.

PS: I also tried the workaround shown in the post of @alienwareone (with a derived SqlExpression returning a SqlFragment in VisitChildren), but it also does not work in my case. See this fiddle.

roji commented 7 months ago

Problem 1: Can't use SqlFragment directly

Problem 2: Projection gets screwed up for subquery

InspiringCode commented 7 months ago

But as this issue shows, there are in any case ways around the type mapping problem (i.e. having a SqlFunctionExpression that is later converted to SqlFragmentExpression). In other words, if users have decided to extend the query pipeline, they're already doing very advanced/risky stuff on their own, and blocking them via SqlFragmentExpression's type mapping isn't very useful.

@roji I too think that allowing SqlFragmentExpression would be the less confusing solution. My experience as API user was as follows:

I hope this short API experience story is of some help.

roji commented 7 months ago

@InspiringCode thanks for the added details - that's useful indeed. FWIW scenarios where users need to construct their own SQL trees for functions are quite rare/advanced, and ones where they need some sort of SQL construct which isn't already supported is even rarer - we hardly ever see people doing that. In general, integrating arbitrary "stuff" into the SQL tree is more complicated than it looks - as you can see from the above; it's simply not possible to just "make it work" in all scenarios etc. But allowing a type mapping specifically may be OK - I'll discuss with the team.

BTW I'd be interested in why you need to generate COUNT(*) OVER () in the first place.

InspiringCode commented 7 months ago

@roji Thank for taking the time to review the issue.

My use case is the following:

public static async Task<PagedResult<T>> ToPagedResultAsync<T>(
    this IQueryable<T> query,
    int page,
    int pageSize)
{
    var result = await query
        .Select(x => new
        {
            Item = x,
            TotalCount = TotalRowCount()
        })
        .Skip(page * pageSize)
        .Take(pageSize)
        .ToArrayAsync();

    return new PagedResult<T>(
        Items: result.Select(x => x.Item).ToArray(),
        Page: page,
        TotalCount: result.FirstOrDefault()?.TotalCount ?? 0
    );
}

This was a lot faster than issuing two separate query (the additional row count for complex queries was barely noticeable).

roji commented 7 months ago

Thanks for the added detail - yeah, hopefully one day EF will have support for window functions (#12747), at which point such hacks will no longer be needed. Though another possibly better way to handle this would be to allow batching LINQ queries (#10879) - the downside with the window function hack is that at least in theory, the database is recalculating the total rows for each row it returns (though this is hopefully optimized away under the hood).

InspiringCode commented 7 months ago

@roji Query batching doesn't solve the problem in this case because the round-trip is not the expensive thing. Running COUNT(*) over the actual query takes nearly as long as the actual query itself which is a few seconds in my case (times two).

I really hope EF will support window functions natively soon, because I think they are an essential tool in business applications.

roji commented 7 months ago

How is it possible that running COUNT() in a separate query would take more than `COUNT() OVER ()` which does the same?

InspiringCode commented 7 months ago

@roji When using COUNT(*) OVER() what SQL Server does is basically:

  1. Run the complex and expensive query
  2. Write all the results to a temp table (the Table Spool nodes in the query plan)
  3. Count the rows of the temp table
  4. Apply the paging to the temp table and return the rows of the current page

Here is a comparison of the actual execution plan (above: with COUNT(*) OVER(), below: plain query without any COUNT. Note that there are 3 Table Spool nodes but they are all the same node (same primary ID, just duplicated in the graphics):

count-over-query-plan-1

When I just run a normal COUNT(*) it takes basically as long as the actual query itself (around 6.2 seconds), which would give me around 12.5 seconds with batched query feature vs 7.5 seconds total with COUNT(*) OVER().

roji commented 7 months ago

Ah I see, the part I was missing was that the COUNT(*) OVER () was already over the result of a complex query, but before the paging; so that to get the same you'd have to redo the query (without paging) and do COUNT(*) over that. The repetition of the query is the problem here.

So thanks, that's a pretty convincing scenario for enabling some sort of support for window functions - I've added a note to https://github.com/dotnet/efcore/issues/12747#issuecomment-1933796780.

roji commented 7 months ago

Design decision: we think allowing SqlFragmentExpression to have a type mapping is reasonable; SqlFragmentExpression is a very advanced escape hatch, and it doesn't make much sense to specifically block that there (especially given that people are already working around that limitation as above).

Charlieface commented 6 months ago

@roji I too think that allowing SqlFragmentExpression would be the less confusing solution. My experience as API user was as follows:

  • I first learned that you can map CLR functions to database functions/procedures - that's great! I then found, that you can use HasTranslation for more advanced scenarios which really captured my attention.

  • After doing some research and looking at the examples in the official docs I concluded: ah, you have to return some kind of SqlExpression, so let's look what expressions there are and what might fit my purpose. Hmm, I just want to insert some static simple SQL. Hmm, SqlFragmentExpression seems to do exactly this, perfect, let's try it.

  • Hmm, strange, doesn't work. Ok, let's try something different and derive our own SqlExpression. Let's look what I would have to override. Ah, there is the Print method, that's exactly what I need, I just need to print a SQL fragment. So let's return my SQL from there.

  • Hmm, strange, that doesn't work either. Let's double check by returning a SqlFunctionExpression as in the docs. Hmm, that works. Ok, so let's copy the SqlFunctionExpression 1-to-1 and name it MySqlFunctionExpression. Ok, that doesn't work either. Conclusion: the actual SQL generation magic must happen somewhere else.

  • Ok, luckily we have all the source code of EF. Let's check out how this works. Let's do "Find all references" on SqlFragmentExpression and SqlFunctionExpression. Hmm, ok. It seems the actual statement is generated by the QuerySqlGenerator. Ok, now it's starting to get a bit complicated...

  • Ok, I surrender. Let's open an issue on GitHub and hope that some EF god has mercy to tell me my obvious mistake or misunderstanding.

I hope this short API experience story is of some help.

I've had the exact same experience with any expression not supported (in any minor way) by EF Core.

This is a perfect example of why the whole QuerySqlGenerator pipeline is completely broken from a design perspective. The only way to override any of it is to basically rebuild all the dependency hookups from scratch, and hope that noone else has done the same, and hope that the API doesn't change under you in the next release.

There should be exactly one place where the SQL is generated: in the Print function on each SqlExpression. This should mean that I can create arbitrary SqlExpression classes and override Print Update Equals and GetHashCode. the use a custom HasTranslation, and I should be good to go, without touching anything else in the pipeline.

Basically what I'm saying is: QuerySqlGenerator is a monolithic blob of code that is not easily extensible, and needs to be refactored,

roji commented 6 months ago

@Charlieface I'd advise getting to know the innards of EF and what it means to generate SQL in-depth before making such statements.

The problem with adding a new SQL expression is not QuerySqlGenerator at all; that's just a standard expression visitor that's trivial to extend, adding support for arbitrary new expression types. The problem is that SQL expression aren't just inserted into the tree and then printed - there's various intermediate processing around them.

As an example, EF performs "type mapping inference"; that means that when you write Where(b => b.Name.Substring(5) == name), the actual type mapping of name (is it an nvarchar(max)? varchar(x)) gets inferred from the left side of the comparison; on that left side, the type mapping bubbles up from b.Name through the Substring method call, to then be applied to the right side. In other word, when translating the Substring() method call, we need logic that knows where to bubble up the type mapping from. Any new expression type would need to have the same handling in order to work correctly. You can take a look at SqlExpressionFactory to learn more about this.

Another example is nullability processing; EF performs various compensation to make the SQL behave in the same way around NULLs as the original C# code does (although C# has 2-value logic and SQL has 3-value logic). This requires EF to know about each different expression type, and process it correctly; take a look at SqlNullabilityProcessor to know more.

In other words, the problem isn't QuerySqlGenerator, it's the fact that full support for a new expression type needs to be implemented in various stages of the EF query pipeline; all of these are extensible (at least to some extent), but the task isn't easy and requires in-depth knowledge of EF internals.

That's just the reality of a complex, feature-rich ORM such as EF - it's easy to think you want to just add a SQL expression, but that's just not how it works. We've also seen very few people actually try to add expression types, and the right way to deal with that is to implement the feature that those users are looking for (e.g. window functions), rather than for users to try to insert support for custom SQL expressions into the ORM.

Coder3333 commented 4 months ago

All I want to do is be able to reference the Sql Server CURRENT_USER command. If there were a built in way to do this, I would take it, but since there does not appear to be one, I tried to implement it as a SqlFragmentExpression. On paper it made so much sense, but of course it doesn't actually work. I will now try to implement it as a function, but it would have been so nice to not have to go through all of these extra steps.

What happened to https://github.com/dotnet/efcore/pull/33053? It looks like it was completed months ago, but still appears to not be in production.

roji commented 4 months ago

@Coder3333 we're indeed missing easy mapping of user-defined niladic functions (i.e. functions which have zero arguments and don't have parentheses). However, SqlFragmentExpression isn't needed - you can simply use HasTranslation to map directly to SqlFunctionExpression - see #33817 for the code.