Open ErikEJ opened 4 years ago
Just because I would not know how to do that! Do you think it makes any difference performance wise?
Hey @ErikEJ,
Do you think it would be safe to remove the condition for checking argument length for Postgresql (Npgsql)?
Good job BTW!
Yes, looks like some source say short.MaxValue is the limit, but not sure about the ADO.NET provider - @roji ?
Sorry, am missing some context... What exactly are we talking about here? :)
Oh, OK. Yeah, PostgreSQL supports up to ushort.MaxValue (65535) parameters for each statement - but a batched command can have more than 65535 parameters (as long as no single statement does). What current argument length check do you have?
Current max for SQL Server is 2048, so it can be much higher for postgress
Yep, absolutely.
Sorry for raising the dead here a bit, But im trying to put it into context and this seemed a better location than the EF Core issue for it (if you think otherwise, let me know and ill mirror there):
You chose to update only the "bucketed" solution from divega's proposals - Is that solely because its more globally supported by the various DB vendors? Strictly in SQL Server (and Azure SQL), do you think the string_split or TVP solutions are preferable in the context of plan cache pollution?
I'm specifically more worried about using FromSql
/FromSqlInterpolated
if i can avoid it entirely.
I am actually writing a blog post about using string_split just now.
This solution is provider agnostic, string_split SQL Server only.
And when I wrote the code, TVF mapping was not available in EF Core, it is now, so no FromSql needed!
Then i'll be looking forward for that article and will take another look at TVFs (which i thought not relevant in this context for some reason). Thanks!
I just blogged: "Avoiding SQL Server plan cache pollution due to unparameterized Contains queries generated by EF Core" https://erikej.github.io/efcore/sqlserver/2021/11/22/efcore-sqlserver-cache-pollution.html
I had to use the IQueryableExtensions.In(), to support older SQL Server versions.
And had an issue when using a library that parses the table name from generated query (I am using multiple Interceptors). The issue is when IEnumerable<TKey> values
is empty.
It has the check if(!values.Any())
and returns queryable.Take(0)
.
When that call is used EF Core generates an subquery, and the library that I use (EFCoreSecondLevelCacheInterceptor) fails trying to fetch table name from query.
I resorted to return queryable.Where(x => true)
instead. And that seems to work after that. Posting this in case someone else has this problem.
Updated code of that extension method, according to that change:
public static IQueryable<TQuery> In<TKey, TQuery>(
this IQueryable<TQuery> queryable,
IEnumerable<TKey> values,
Expression<Func<TQuery, TKey>> keySelector)
{
if (values == null)
{
throw new ArgumentNullException(nameof(values));
}
if (keySelector == null)
{
throw new ArgumentNullException(nameof(keySelector));
}
if (!values.Any())
{
//.Where instead of .Take(0), cause that seem to produce "funky" SQL.
return queryable.Where(x => true);
}
var distinctValues = Bucketize(values);
if (distinctValues.Length > 2048)
{
throw new ArgumentException("Too many parameters for SQL Server, reduce the number of parameters", nameof(keySelector));
}
var predicates = distinctValues
.Select(v =>
{
// Create an expression that captures the variable so EF can turn this into a parameterized SQL query
Expression<Func<TKey>> valueAsExpression = () => v;
return Expression.Equal(keySelector.Body, valueAsExpression.Body);
})
.ToList();
while (predicates.Count > 1)
{
predicates = PairWise(predicates).Select(p => Expression.OrElse(p.Item1, p.Item2)).ToList();
}
var body = predicates.Single();
var clause = Expression.Lambda<Func<TQuery, bool>>(body, keySelector.Parameters);
return queryable.Where(clause);
}
I really like the solution proposed but can't seem to get the query below working. I also tried using String_Split solution but that defaults to nvarchar(4000) converting to varchar kills the db :) as db I am working on is using varchars
var childItemFilter = List
@Henery309 thanks for the kind words.
I do not see any attempt to use the In extension method in the code Shared. Please share a full repro and I will have a look.
Also, what does not working mean?
Thanks for helping Erik appreciate it a lot.
I have the following query which retrieves all projects that are assigned to staff members Because I am using a contains in the linq query sql is generating 1 sql plan for every staff member.
var dbcontet = new ProjectDbContext(); var query = dbcontext.Projects.Where(m=> m.StaffMembers.Any(s=> staffReferences.Contains(s.StaffReference)));
I tried using the extension method IN: query = query.In(staffReferences, s => s.StaffMembers.SelectMany(s => s.StaffReference));
But this fails with the following error message: System.InvalidOperationException: 'The LINQ expression 's => s.StaffReference' could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', or 'ToListAsync'. See https://go.microsoft.com/fwlink/?linkid=2101038 for more information.'
@Henery309 I need a complete runnable repo in order to help, not code fragments
@ErikEJ have prepared a running example: https://github.com/Henery309/EfCoreTest The runnable code uses sqllite as the database and outputs the generated sql to the console.
Using the following code: var projs = db.Projects.In(staffIds, m => m.StaffMembers.Single().Id).ToList();
The output I get is:
SELECT [p].[Id], [p].[Name] FROM [dbo].[Project] AS [p] WHERE (( SELECT TOP(1) [s].[Id] FROM [dbo].[StaffMember] AS [s] WHERE [p].[Id] = [s].[ProjectId]) = @__v_0) OR (( SELECT TOP(1) [s0].[Id] FROM [dbo].[StaffMember] AS [s0] WHERE [p].[Id] = [s0].[ProjectId]) = @__v_1)
multiple selects are added to where clause
I am trying to figure out how to get:
SELECT [p].[Id], [p].[Name] FROM [dbo].[Project] AS [p] WHERE EXISTS ( SELECT 1 FROM [dbo].[StaffMember] AS [s] WHERE ([p].[Id] = [s].[ProjectId]) AND ([s].[Id] = @param1 OR [s].Id == @param2))
@Henery309 Hmmm... This is made to solve a SQL Server specific issue with hardcoded paramters, and I see no hardcoded values in the SQL your are sharing. If you use SQLite, this may not be needed at all.
I think SQL Server does not have issus with the multiple SELECTs. I am not sure which implementation if the In method you are using, but there are some suggesions for collapsing ORs in the comments in the gist.
Any reason to generate so many ORs rather than to just put the parameter placeholders in the IN parentheses?