dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.73k stars 3.18k forks source link

Hashset contains becomes a traditional IN statement instead of the new OpenJSON parameter #32365

Closed ngelotte closed 4 months ago

ngelotte commented 11 months ago

File a bug

If you use a HashSet for the parameter which is used by contains it generates the old style non parameterized query

HashSet<string> filterData = ["Thing1", "Thing2"];
var results =  db.TableOfStuff.Where(t => filterData.Contains(t.Item)).ToList();

The following code generates the new parameterized query but it seems like an unnecessary ToList and an easy place to miss adding the ToList and end up with a performance degradation.

HashSet<string> filterData = ["Thing1", "Thing2"];
var results =  db.TableOfStuff.Where(t => filterData.ToList().Contains(t.Item)).ToList();

Include provider and version information

EF Core version: 8.0 Database provider: Microsoft.EntityFrameworkCore.SqlServer Target framework: .Net 8.0 Operating system: Windows IDE: Visual Studio 17.8

AlexeiScherbakov commented 11 months ago

I also encoutered this bug. I my case it's broke query with exception: Microsoft.Data.SqlClient.SqlException : Cannot perform an aggregate function on an expression containing an aggregate or a subquery.

Expression part is:

string[] shippedStatuses = new[]
{
    "Delivered",
    "Shipped",
    "Net"
};
...
((x.OrdersNetPCS > 0)
|| ((x.OrdersApprovedPCS > 0) && shippedStatuses.Contains(x.StatusNameLine)))

SQL is generated:

SELECT 1
                FROM OPENJSON(@__shippedStatuses_2) WITH ([value] nvarchar(max) '$') AS [s]
                WHERE [s].[value] = [e].[StatusNameLine] OR ([s].[value] IS NULL AND [e].[StatusNameLine] IS NULL)))

Old version (6 and 7) generates normal SQL:

[e].[OrdersNetPCS] > 0.0 OR ([e].[OrdersApprovedPCS] > 0.0 AND [e].[StatusNameLine] IN (N'Delivered', N'Shipped', N'Net'))
roji commented 11 months ago

@AlexeiScherbakov that doesn't seem related to anything in this issue, and also doesn't contain enough details to reproduce the problem.

Can you please open a new issue with a minimal, runnable repro showing the failure?

AlexeiScherbakov commented 11 months ago

@roji done - https://github.com/dotnet/efcore/issues/32374

roji commented 11 months ago

Thanks for reporting this, I've confirmed the bug and will investigate.

Repro ```c# await using var ctx = new BlogContext(); await ctx.Database.EnsureDeletedAsync(); await ctx.Database.EnsureCreatedAsync(); var ids = new HashSet { 1, 2 }; _ = ctx.Blogs.Where(x => ids.Contains(x.Id)).ToList(); public class BlogContext : DbContext { public DbSet Blogs { get; set; } protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder) => optionsBuilder .UseSqlServer(@"Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0;Encrypt=false") .LogTo(Console.WriteLine, LogLevel.Information) .EnableSensitiveDataLogging(); } public class Blog { public int Id { get; set; } public string? Name { get; set; } } ```
dggmez commented 11 months ago

Came here to report this. It seems it works correctly with List, IEnumerable and arrays. It doesn't work with HashSet, ImmutableList, ImmutableArray, ImmutableHashSet, FrozenSet, etc.

ibrahim33 commented 9 months ago

Hello everyone. Is there an ETA when this will be released? We are currently using a custom implementation of OPENJSON and we cannot remove it nor upgrade to EF 8 because of the bug.

LINQ with HashSet:

var productSets= await dbContext.ProductSets()
            .Where(ps => ids.ToHashSet().Contains(ps.ProductId))
            .ToListAsync(cancellationToken)

QUERY with HashSet:

SELECT [p].[Id], [p].[ProductId], [p].[ReactivateIfSubProductsReappear], [p].[Version], [p0].[Id], [p0].[IsMainProduct], [p0].[ProductId], [p0].[ProductSetId], [p0].[Quantity]
FROM [externaldata].[ProductSet] AS [p]
LEFT JOIN [externaldata].[ProductSetProduct] AS [p0] ON [p].[Id] = [p0].[ProductSetId]
WHERE [p].[ProductId] IN (CAST(213171 AS bigint), CAST(234107 AS bigint), CAST(254039 AS bigint), CAST(252018 AS bigint), CAST(257030 AS bigint), CAST(282759 AS bigint), CAST(290215 AS bigint), CAST(352284 AS bigint), CAST(354318 AS bigint), CAST(354179 AS bigint))
ORDER BY [p].[Id]

LINQ with List:

var productSets= await dbContext.ProductSets()
            .Where(ps => ids.ToList().Contains(ps.ProductId))
            .ToListAsync(cancellationToken)

QUERY with List:

SELECT [p].[Id], [p].[ProductId], [p].[ReactivateIfSubProductsReappear], [p].[Version], [p0].[Id], [p0].[IsMainProduct], [p0].[ProductId], [p0].[ProductSetId], [p0].[Quantity]
FROM [externaldata].[ProductSet] AS [p]
LEFT JOIN [externaldata].[ProductSetProduct] AS [p0] ON [p].[Id] = [p0].[ProductSetId]
WHERE [p].[ProductId] IN (
    SELECT [i].[value]
    FROM OPENJSON(@__ids_0) WITH ([value] bigint '$') AS [i]
)
ORDER BY [p].[Id]
roji commented 9 months ago

we cannot remove it nor upgrade to EF 8 because of the bug

Why is this blocking you from upgrading to EF 8? EF 8 simply behaves in the same way as EF 7 on this particular point.

ibrahim33 commented 9 months ago

we cannot remove it nor upgrade to EF 8 because of the bug

Why is this blocking you from upgrading to EF 8? EF 8 simply behaves in the same way as EF 7 on this particular point.

It is just because of the way we implemented our version of OPENJSON, which overrides some methods of the ef core sql generator and some of these methods no longer work the same way in EF 8 or are removed. We could certainly adjust the implementation to make it work, but we would rather not invest this time if this issue can be addressed in the near future. That is the reason I am asking for an ETA

roji commented 9 months ago

I'm not sure how this particular problem could be a blocker - if some methods which you're overriding changed between EF 7 and 8, then you'll need to change your implementation after this is fixed as well.

In any case, I definitely want to fix this for 9, but I think it's pretty unlikely we'll backport the fix to 8, since it's not a regression and we haven't had many complaints yet.

ibrahim33 commented 9 months ago

if some methods which you're overriding changed between EF 7 and 8, then you'll need to change your implementation after this is fixed as well.

It is not a blocker but we would then completely remove our custom implementation when this is fixed, instead of having a hybrid solution in which lists are handled by EF and Hashsets by us.

In any case, I definitely want to fix this for 9, but I think it's pretty unlikely we'll backport the fix to 8, since it's not a regression and we haven't had many complaints yet.

Thank you for the info.

ADNewsom09 commented 9 months ago

You can do .AsEnumerable().Contains( on the hashset and it will pass as an openjson call without the overhead.

It appears the issue is it is translating HashSet's overridden .Contains method differently from IEnumerable's or ICollections'

m-gasser commented 8 months ago

I'm not sure how this particular problem could be a blocker - if some methods which you're overriding changed between EF 7 and 8, then you'll need to change your implementation after this is fixed as well.

It probably is not a blocker for anyone except us. We are trying since many years to allow our developers to use EF Core in addition to our inhouse ORM. With EF Core 7 and custom optimizations for IN Statements and for https://github.com/dotnet/efcore/issues/30912 we could finally reach performance that was good enough to greenlight EF Core for most of our databases. Most of our .Contains() statements are with sets and not lists, so it is crucial for us to get the OpenJSON optimization there too. It would be fine, if we have to customize EF Core for that, like we did for EF Core 7. We have to keep some customization anyways for https://github.com/dotnet/efcore/issues/30912. Our problem is, we can no longer hook in at the same place, as we did with EFCore 7, because things are now handled differently. Any pointer where we can hook into EF Core 8 to reimplement the OpenJSON support for sets would be greatly appreciated!

In any case, I definitely want to fix this for 9, but I think it's pretty unlikely we'll backport the fix to 8, since it's not a regression and we haven't had many complaints yet.

We are aware that compared to EFCore 7 it is not a regression but a big improvement! It is however a regression compared to our customization that we have to do, to reach our performance targets. It would be quite unfortunate if we had to stay on EF Core 7 for another whole year (and ironically for the feature of .NET 8 we were the most excited about).

OndrejUzovic commented 8 months ago

Please do not fix this bug until #32147 is fixed. This bug acts as an important workaround for #32147.

m-gasser commented 8 months ago

@OndrejUzovic You can also use EF.Constant() as a workaround for that bug. This is a supported feature and will continue to work in future versions. This seems like the better workaround to use, no need to rely on another bug and trust that the other bug does not get fixed.

roji commented 8 months ago

@OndrejUzovic @m-gasser EF.Constant() is indeed a good workaround, but has quite a heavy performance overhead. However, it's very likely we'll provide a more first-class way to opt-out of OPENJSON translation for 9.0, so both these bugs will likely go away.

vas6ili commented 6 months ago

You can also use EF.Constant() as a workaround for that bug. This is a supported feature and will continue to work in future versions.

While I agree with this general statement, we find ourselves in a fairly unusual situation where we're in the process of migrating a fairly large application from EF6 to EF Core and have to support both ORMs for a some time before we sort out all the bugs/differences and the same LINQ expressions have to be translatable in both ORMs, in this case the HashSet<> workaround worked better for us.

cincuranet commented 4 months ago

Note to self (from design): We should handle Contains on any "ICollection", not only HashSet.

OndrejUzovic commented 4 months ago

Since by fixing this bug the workaround for #32147 will not work anymore. Are there any plans to fix also #32147?

roji commented 4 months ago

@OndrejUzovic yeah, we're definitely planning on fixing that (or providing a better workaround) for 9.0.