dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.65k stars 3.15k forks source link

Parameterized Offset/Fetch makes query slower #28632

Open Singway opened 2 years ago

Singway commented 2 years ago

scenario

Recently I created an column store index to improve the performance. It was fine if I directly specify the input of Offset/Fetch. However, I found that when I executed the sql generated by EF which is parameterized, the performance got almost 10 times slower than the former. It could be a disaster if there are extra entities to join.

I wonder if there is an solution for my scenario or maybe is that an opportunity for EF? The Skip()/Take() only accept the int parameter, so I guess it is relatively safe and might be unnecessary to parameterize the input for Offset/Fetch.

samples

Parameterized Offset/Fetch with 1 row image image

Non-parameterized Offset/Fetch with 1 row image image

Non-parameterized Offset/Fetch with 900 row image image

version information

EF Core version: 3.1 Database provider: SQL Server in Azure

roji commented 2 years ago

@Singway thanks for this information.

Interestingly, the costs below seem identical for parameterized vs. constant, so even if the execution plan differs, I'm wondering if there's an actual difference. In other words, it makes sense that at planning time, the planner has more information in the constant case, since it knows up-front how many rows will be returned and doesn't require the parameter values of a specific query; but whether that actually impacts query performance is another question.

Any chance you could benchmark the two?

FYI the reason we parameterize Skip/Take is that if we don't, every differing value would cause a completely different query plan to be generated (so Skip(1) and Skip(2) would result in two separate plans); that can create its own performance problems.

Singway commented 2 years ago

@roji thank you for your kind reply. Sorry maybe I didn't demonstrate it clearly. It actually had an actual difference. Let's focus on the first and the second plan, we would end up finding that the elapses respectively were 1.506s and 0.167s with parameter and constant.

And you are right the SQL Server optimizer utilize those up-front info like statistics to adjust the execution plan.

I realized the reason why EF needs to parameterize those input but it dose impact the performance sometimes(at least in my case). I'm wondering if there will be a config that we could to decide whether to use the parameter.

roji commented 2 years ago

@Singway thanks for the additional info.

I'd like to investigate this, but we're currently heads down stabilizing the 7.0 release. Could I ask you to put together a minimal repro that shows the above happening, i.e. a simple table schema, some data, and the two queries with differing performance characteristics?

Singway commented 2 years ago

@Singway thanks for the additional info.

I'd like to investigate this, but we're currently heads down stabilizing the 7.0 release. Could I ask you to put together a minimal repro that shows the above happening, i.e. a simple table schema, some data, and the two queries with differing performance characteristics?

Sure. I've push the sample to a Repro. You could import the data with Backup.bacpac in the Folder Backup and then execute the script in the folder Script. The schema is not same as the screenshot I posted at first but it do the same thing which is extracting the clustered key with opertor Offset/Fetch. Hip and thx.

ajcvickers commented 2 years ago

/cc @roji

roji commented 1 year ago

@Singway I just tried to access the repro link you posted above, but I'm getting a 404. Can you please check and post the right link?

roji commented 1 year ago

Note suggestion in #31552 to re-introduce overloads of Skip/Take accepting a lambda, so that we can differentiate between parameters and constants. This is similar to how EF6 worked apparently, except that a non-lambda call to Skip/Take was interpreted as a constant rather than as a parameter.

Liero commented 1 year ago

Constants vs Parameters is not just about Skip/Take. Also a constant in Where() clause can make a big difference. In my case 2s vs Timeout (30s)

roji commented 1 year ago

@Liero Where accepts a a lambda expression, so EF sees an expression tree. This means that you can yourself control whether to use a lambda of a parameter:

// Constant:
_ = context.Blogs.Where(b => b.Name == "foo");

// Parameter
var name = "foo";
_ = context.Blogs.Where(b => b.Name == name);

Skip and Take, on the other hand, accept a value rather than a lambda, so there's no expression tree and no distinction between constant and parameter as above. This is why this problem is specific to Skip and Take (and Contains, incidentally).

Liero commented 1 year ago

Yeah, but that is not sufficient.

I want this

var name = "foo";
_ = context.Blogs.Where(b => b.Name == name);

to be translated as constant.

I currently have to write:

_ = name switch
{
    "foo" =>  context.Blogs.Where(b => b.Name == "foo");
    "bar" =>  context.Blogs.Where(b => b.Name == "bar");
    ...
}

I would prefer

context.Blogs.Where(b => b.Name == EF.Constant(name));
roji commented 1 year ago

@Liero you can also use the Expression builder API to dynamically construct the expression tree predicate passed to Where (i.e. use Expression.Constant); that should be an easier and much more robust way to integrate constant nodes in a part of the query.

However, we can use #31552 to track EF.Constant (or similar) as a nicer way to produce constant nodes without doing the above; that would be quite a specialized (and somewhat obscure) query feature, but I guess it makes sense.

Note that something like EF.Constant would not be usable with Skip/Take as they exist today, since as I wrote above, these accept an int parameter and not an expression tree; that means that if you put EF.Constant inside Skip/Take at the top-level query, it would simply get immediately evaluated and the result passed to Skip/Take. There's simply no way to make a constant vs. parameter distinction for Skip/Take (at least not when these methods are you used in the top-level, rather than nested within a lambda within the querY).

roji commented 9 months ago

Just to summarize the different possibilities here:

roji commented 1 month ago

Another user report: #34369