Open roji opened 5 years ago
Oh, I thought it was something I could do in my project. I'll try to put together a minimal project to share with you.
Hi @roji - this is some initial thoughts on this issue. Hope to follow up later with some more thorough analysis. Was brought here by a coworker who was stumped by slow queries in both EFCore and EF6, and blamed this issue.
For queries with includes, we currently generate joins with a subquery: -- @roji , original post in thread
and
Stepping back, his issue was https://github.com/dotnet/efcore/issues/17622#issue-489433134 opened purely to eliminate subqueries which are unnecessary, replacing them with mathematically equivalent SQL that performs JOINs directly without the subquery: [...example removed from quote...] The various subsequent discussions seem to widen this scope and bring in various other questions (at least in some cases) - single query vs. split query performance, or changes to the SQL (INNER instead of LEFT join) which would cause different data to be returned, and therefore would be incorrect. -- @roji
and
https://github.com/dotnet/efcore/issues/17622#issuecomment-966789915 -- @smitpatel
@roji If this issue is for cases where there are equivalent mathematical transforms, then I think I have a regression test to add to the test suite that I discovered today:
For example, if I have a schema like:
1 Transaction ... Has Many Account Allocations 1 Transaction ... Has 1 Transaction Type 1 Transaction ... Has 1 Warehouse 1 Transaction ... Has 1 Currency 1 Account Allocation ... Has 1 Account 1 Account ... Has 1 Group
return Set<AccountAllocation>()
.Where(a => !a.Deleted)
.Where(a => a.TransactionType.AffectsPosition)
.Where(a => a.Account.IsActive)
.Where(a => a.Account.Group.Id == accountGroupId)
.Where(a => a.Transaction.TransactDate > cutoffDate)
.Where(a => a.Transaction.Currency.Id == currencyCode)
.Where(a => !a.Transaction.Warehouse.IsTransferHub)
.Take(1) // new command inbetween - this forces a TOP (1) which alleviates some table scan pressure on the query
.Any();
then, in EF6, I get a query where the .Where
statements without free variables are generated as table joins, but the .Where
statements with free variables (accountGroupId
, cutoffDate
and currencyCode
) are left outer joined to all those table joins. This results in a really large number of read operations, because the EXISTS check is after the left outer join. Thus, I have to cheat and add Take(1) to simulate a TOP 1 to get this query to not be terrible on SQL Server. Without Take(1), on a table with > 1.5 million transaction entities in the cut off period, it will scan every one of the pages containing those transaction entities, and takes about 6 seconds to run. With Take(1), it has an initial SQL Server query plan compilation overhead for the first run (on the db server side), and after that, runs in 1 ms flat.
If you see my point, there seems to be broader pattern of how transitive navigation paths are handled. It looks like the previous merges by Smit linked to this issue deal with other scenarios, like OrderBy (query) and HasQueryFilter (model builder configuration) and Includes.
If the broader pattern is transitive navigation paths handling, then I see why @msmolka added a comment here as well. It's also not particularly clear to me why placement of free variables should alter the query, but it seems like it may. There's probably a good reason I do not see why this is happening.
@jzabroski - You are posting a SQL generated by EF6. We have no plans to make any changes to that. Which brings us to question - What is the generated query in EF Core. In the absence of any of those navigation being an owned navigation, all of them would expand on same level without causing subquery as EF6 did. So please run your query in EF Core and if that is not satisfactory to you, file new issue with detailed repro steps.
@smitpatel Should this be fixed in EFCore 7?
I'm using EFCore7 with Npgsql 7.0.3
I have a reproduction repo here: https://github.com/CharlieDigital/efcore-m2m
docker compose up -d
dotnet run
# Omit the | jq. if you don't want pretty print.
curl http://localhost:5256/init & curl http://localhost:5256/products | jq .
The generated query is as follows:
SELECT p.id, p.created_utc, p.name, t.product_id, t.media_id, t.id, t.created_utc, t.name
FROM products AS p
LEFT JOIN (
SELECT p0.product_id, p0.media_id, m.id, m.created_utc, m.name
FROM product_media AS p0
INNER JOIN media AS m ON p0.media_id = m.id
) AS t ON p.id = t.product_id
ORDER BY p.id, t.product_id, t.media_id
There's a branch with no explicit ProductMedia
relationship table: https://github.com/CharlieDigital/efcore-m2m/tree/variant/no-explicit-join
Result is the same.
This issue is in the Backlog milestone. This means that it is not planned for the next release (EF Core 8.0). We will re-assess the backlog following the this release and consider this item at that time. However, keep in mind that there are many other high priority features with which it will be competing for resources. Make sure to vote (👍) for this issue if it is important to you.
@roji That's a shame; this issue has serious performance implications beyond toy datasets and it may not be obvious to users unless they are dumping their SQL that this is the cause of the performance issues.
@CharlieDigital can you provide a clear, simple comparison and query plan showing the performance issue? That would help bump the priority of this. Note that I'm not saying we shouldn't do this - it would just help to have a clear, side-by-side comparison showing an exact difference.
Would also like to see this fixed. Besides the performance - it makes reading the generated sqls really hard. :-/
@roji on simple queries, the performance is as expected. We've observed the issue when performing deep ThenInclude
and ordered sub-queries. For example, an Order
that has multiple LineItem
s which reference a Product
which has a set of Media
that are ordered.
The workaround that we ended up using is to carry a discriminator field down as deep as practical and then performing a filter on the ThenInclude
using the discriminator. For example, consider a case where we have many Store
s in the DB.
We would add a StoreId
field to Media
so that we can filter the Media
so we can filter the Media
by the StoreId
in the ThenInclude
.
I've moved off of the project so I'll see if I can find some time to produce a sample project that demonstrates this issue, but we saw queries go from 16s down to sub-20ms on our dataset.
We've observed the issue when performing deep ThenInclude and ordered sub-queries. For example, an Order that has multiple LineItems which reference a Product which has a set of Media that are ordered.
That sounds like it could intersect with #29171, which is about removing the orderings from the SQL altogether.
I definitely think there's a potential perf issue here (simple joins instead of subquery joins), but one of the problems is that we haven't yet seen a clear, minimal sample backed by server plans and/or benchmarked query runtime that prove the difference. I definitely intend to investigate this at some point, but such a clear repro would certainly help prioritize this.
If you need a demonstration, I just ran an SQLite test, if anyone is interested. I simplified a structure we commonly use in production so it's not an adversarial example or anything like that, just fake data. I can simplify even more, formalize the test and pack it up if someone can provide me with requirements.
We very often have a three-level hierarchy of one-to-many mappings: each Lot has many Parts, and each of those Parts has a few Inspections. If you want to fetch a lot, you could just naively use context.Lots.Include(l=>l.Parts).ThenInclude(r=>r.Inspections).Where(l=>l.Id = 15)
. Unfortunately, this will generate a query which reads through the entire Parts and Inspections tables.
I populated my test DB with 100 lots totaling about 500 MB and that took over 3 seconds on my laptop. When I refactor the SQL to flat joins, it's done in less than 100 ms. The outputs are identical.
On SSDs this is a nuisance, but on HDDs it's completely unusable. In the past, to work around this, we resorted to hacks like manually querying just the lot, then the min and max IDs of parts inside the lot, then fetching all the parts between this ID range which belong to the lot.
Original SQL with SQLite's query plan:
SELECT "t"."Id", "t"."Barcode", "t0"."Id", "t0"."EndTime", "t0"."LotId", "t0"."StartTime", "t0"."Id0", "t0"."CaptureTime", "t0"."Message", "t0"."PartId", "t0"."Passed"
FROM (
SELECT "l"."Id", "l"."Barcode"
FROM "Lots" AS "l"
WHERE "l"."Id" = 33
LIMIT 1
) AS "t"
LEFT JOIN (
SELECT "p"."Id", "p"."EndTime", "p"."LotId", "p"."StartTime", "i"."Id" AS "Id0", "i"."CaptureTime", "i"."Message", "i"."PartId", "i"."Passed"
FROM "Parts" AS "p"
LEFT JOIN "Inspections" AS "i" ON "p"."Id" = "i"."PartId"
) AS "t0" ON "t"."Id" = "t0"."LotId"
ORDER BY "t"."Id", "t0"."Id"
QUERY PLAN
|--CO-ROUTINE t
| `--SEARCH l USING INTEGER PRIMARY KEY (rowid=?)
|--MATERIALIZE t0
| |--SCAN p
| `--SEARCH i USING INDEX IX_Inspections_PartId (PartId=?) LEFT-JOIN
|--SCAN t
|--SCAN t0 LEFT-JOIN
`--USE TEMP B-TREE FOR ORDER BY
Refactored SQL with SQLite's query plan:
SELECT "t"."Id", "t"."Barcode", "p"."Id", "p"."EndTime", "p"."LotId", "p"."StartTime", "i"."Id", "i"."CaptureTime", "i"."Message", "i"."PartId", "i"."Passed"
FROM (
SELECT "l"."Id", "l"."Barcode"
FROM "Lots" AS "l"
WHERE "l"."Id" = 33
LIMIT 1
) AS "t"
LEFT JOIN "Parts" AS "p" ON "t"."Id" = "p"."LotId"
LEFT JOIN "Inspections" AS "i" on "p"."Id" = "i"."PartId"
ORDER BY "t"."Id", "p"."Id"
QUERY PLAN
|--CO-ROUTINE t
| `--SEARCH l USING INTEGER PRIMARY KEY (rowid=?)
|--SCAN t
|--SEARCH p USING INDEX IX_Parts_LotId (LotId=?) LEFT-JOIN
|--SEARCH i USING INDEX IX_Inspections_PartId (PartId=?) LEFT-JOIN
`--USE TEMP B-TREE FOR ORDER BY
For queries with includes, we currently generate joins with a subquery:
We could simplify this to:
We should measure the execution perf difference between the above two. Even if there is no (significant) difference, we could still decide to do this for SQL simplicity.
Originally raised in https://github.com/aspnet/EntityFrameworkCore/issues/17455.