dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.73k stars 3.17k forks source link

Query: potential bug in null semantics VisitSqlBinary #30247

Open maumar opened 1 year ago

maumar commented 1 year ago

SqlNullabilityProcessor.VisitSqlBinary as last step for AndAlso/OrElse performs SimplifyLogicalSqlBinaryExpression. However, before that happens we assign the nullability of the result based on nullability of left and right after the visitation. It is theoretically possible that SimplifyLogicalSqlBinaryExpression optimizes out the expression into constant true or false, however we don't update the resulting nullability, which can throw off further null processing. We should make nullability processor more robust by updating nullability of the result when the expression get simplified to a true/false, and in general always keep the expected nullability of every expression in sync with the expression itself.

Predicate that could cause the problem:

((DbFunctions
    .Like(
        matchExpression: e3.Value, 
        pattern: (e0.Value + "B")) && False) != False)

if we can somehow sneak it through the compiler (it optimizes it to 0 != 0 before we even get to EF processing).

ranma42 commented 3 months ago

@maumar IIUC in the case you describe the analysis is returning correct (inaccurate) nullability information; instead of reporting the more precise result (value: constant C, nullable: false) it returns (value: constant C, nullable: true).

This should likely not be much of a problem as the result is still sound. In order to make the query pipeline more robust it should be able to work just fine with everything marked as nullable (this would regress the queries, but ideally should not break anything). OTOH propagating the updated nullability information might make it possible to have more efficient queries.

Note that if the nullability analysis determined (incorrectly) that an expression is non-nullable and then the actual query returned NULLs, then you would have bugs (caused by an unsound analysis, i.e. an analysis that "proved" something false).

maumar commented 2 months ago

I agree, there shouldn't be risk of functional errors (i.e. data corruption) but potential missed optimizations. I'm not too concerned about this, but filed an issue to track this "just in case" thinking we may want to address it "in the future". But I agree its a very low priority issue.