dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.68k stars 3.17k forks source link

New Provider: Overriding the generation of bool predicate for short circuiting #33838

Closed ronnyek closed 2 months ago

ronnyek commented 4 months ago

I'm getting to the remaining handful of tests to make sure pass.

I'm currently looking at TypeBinary_short_circuit. It seems like currently providing some arbitrary bool comparison in the linq ultimately gets generated as a parameterized predicate like

where @someParameter

I believe for this database, I need to generate where @someParameter = true. I know its essentially a predicate (.Predicate property) that is being added to a SelectExpression as a SqlParameterExpression.

I'm struggling to find where an arbitrary condition is being translated into that SqlParameterExpression (.Predicate), so that I can modify it to generate the appropriate sql and parameters.

Can anyone point me in the correct direction?

EF Core version: 6.x Database provider: (e.g. Custom) Target framework: (.NET 6.0) Operating system: Windows 11 IDE: (e.g. Visual Studio 2022 17.4)

maumar commented 4 months ago

parameters are generated in the ExtractParameters step in QueryCompiler class. What's happening there is we look at the query tree that comes to us from Linq:

DbSet<Order>()
    .Where(o => (<>c__DisplayClass159_0.customer is Order))

and we notice that the expression in Where is not correlated with any of the DbSets, and therefore can be evaluated and the value can be put into the parameter, since that fragment will evaluate to the same value, regardless of the contents of the database.

So by the time you reach QueryCompilationContext.CreateQueryExecutorExpression you are already dealing with a parameter. I would leave that as is. SqlServer has a very similar requirement, predicates need to come in a form of comparison

where @myBoolParam will not work, rather we need to generate the expression of shape where @myBoolParam == true

We do that in SearchConditionConvertingExpressionVisitor, as one of the last steps in the query processing. This way we can take advantage of simpler query structure and potentially detect some optimizations before we end up with the final SQL shape.

This approach could work in your case also. SearchConditionConvertingExpressionVisitor also converts away from comparisons where necessary, e.g. in the projection. Sql server doesnt allow things like

SELECT c.Id > 5 as greaterThanFive FROM customers as c

we need to transform it to CASE block - not sure if that's also the case for your provider - if not SearchConditionConvertingExpressionVisitor complexity can be drastically reduced.

ronnyek commented 4 months ago

I see ApplyConversion, and ConvertToSearchCondition. Is the concept in that expression visitor that _isSearchCondition is legitimate query criteria and if that's false, we'd need to handle comparison?

I'm trying to decide which expressions need to be handled, and it seems like that visitor class attempts to apply the conversion in lots of cases.

BuildCompareToExpression seems to be deciding whether the expression is a constant bool and comparing that with 1, or just comparing the expression with true. That all seems common sense.

I think it'd be easy enough to port something like BuildCompareExpression and ApplyConversion, but its unclear to me what isSearchCondition is doing. Sometimes it looks like its being backed up, then updated based on some criteria, then reset, some methods dont appear to set it at all

So Specific questions if you don't mind:

  1. Is there a specific type of expression I should be implementing this equivalent to apply conversion into for specifically this bool comparison, or should I just apply it in the Visit<Expression> methods where this is called in the SqlServer Implementation?
  2. Any information I can get on what _isSearchCondition is doing, and when it should be true false
maumar commented 4 months ago

this visitor is specific to Sql Server, and indeed we need to convert in many places (either from and to comparison expression). Visitor goes through every node and has built-in information at each step whether the result can/should be a comparison (isSearchCondition = true) or it can't be (isSearchCondition = false).

Consider the example:

var myPrm = true;
ctx.Customers.Where(x => myPrm).Select(x => new { MyBool1 = myPrm, MyBool2 = x.Id > 7})

First we go into VisitSelect (which represents SelectExpression for the entire query, rather than just linq Select) , searchCondition is set to false before we visit the projection. This means we expect things in the projection (at least at the top level) to not be comparison expressions (sql server doesn't allow that). We go visit the projection (new { MyBool1 = myPrm, MyBool2 = x.Id > 7})

first thing is a parameter - so we go to visitSqlParameter and call into ApplyConversion, with condition: false) I.e. parameterExpression itself is not a condition. _searchCondition is set to false as well, meaning all is good, we can just leave it as is.

Next is the x.Id > 7, we go into VisitSqlBinary. Before we start processing the node we save the _searchCondition value we got from the parent (i.e. what we expect to get from the binary expression once we are done processing it). Then we visit left and right, so that the logic is recursively applied (and this will modify _searchCondition state). When children are visited and processed, we recover the saved searchCondition value of the parent (which was set to false), The expected "type" of the binary expression depends on the operaor type. GreaterThan represents a condition, however what we need (i.e. what's stored in _searchCondition) is false. So ApplyConversion sees the discrepancy and converts x.Id > 7 to CASE WHEN x.Id > 7 THEN TRUE ELSE FALSE which is allowed by sql server. Now we are done with the projection, we go back to VisitSelect, before we process the predicate we set _isSearchCondition to true, again, what Sql Server requires, We start visiting the predicate which is also a parameter expression, that takes us to VisitSqlParameter.

However this time we expect a condition but the parameter itself is a value. ApplyConversion again sees the discrepancy and converts the parameter to myPrm == true

When it comes to the information where _isSearchCondition should be true/false, it all depends on your provider. I would say good way to test this is to write a bunch of tests that have bool constant or parameter values in random places, and see what throws an exception. The same goes for other way around - add some comparisons in random places and see if database complains.

ronnyek commented 4 months ago

I was actually able to implement this, and I think it mostly just works with one exception. I am continuing to experiment and if I come up with a fix, I'll put it here.

I've got a RegexMethodTranslator that will translate regex to <Column> REGEXP '<pattern>'. The implementation it looks like the SearchConditionConvertingExpressionVisitor is applying conversion and making it read like <Column> = 'True' and '<pattern>' = 'True'.

In debugging through the visitor, its parsing through it not as if it were a specific expression, but rather sees ColumnConstantExpression of the column name and sets that to true, and the regex pattern as a SqlConstantExpression.

The regex syntax doesn't require comparison to true for us.

Is this a case where I should actually create a RegexExpression type, and in VisitSelect, basically skip over the apply conversion if its that RegexExpression, or would there be an easier way to intercept it as something that is expected and not attempt the conversion?

Again... if I figure it out before getting any pointers, I'll update it here.

ronnyek commented 4 months ago

So I already have an expression type ComplextFunctionArgumentExpression that basically gets translated into a single expression and as far as I can think, should never apply a compare to for the individual children.

In VisitSelect I basically tried this

 _isSearchCondition = true;

 if (selectExpression.Predicate is ComplexFunctionArgumentExpression) // <-- added this check and set of _isSearchCondition
 {
     _isSearchCondition = false;
 }
 var predicate = (SqlExpression?)Visit(selectExpression.Predicate);
 changed |= predicate != selectExpression.Predicate;

This seems to resovle the problem, all the while letting all other tests passing. Wondering if this makes sense, or if there is another way I should achieve the same thing

maumar commented 2 months ago

@ronnyek that approach won't work if your predicate is a bit more complicated, e.g Where(x => x.Name != "Foo" && ComplextFunctionArgumentExpression )

Instead, you should switch _isSearchCondition = false inside a method that processes ComplextFunctionArgumentExpression, similar to what we do in SearchConditionConvertingExpressionVisitor.VisitSqlFunction This way it will stop the incorrect conversion irrespective of the shape of your predicate. But the general approach makes sense.