Query: IQueryable in subquery sometimes triggers validation which prevents IQueryables in the final projection

maumar commented 3 years ago

query:

from l1 in ss.Set<Level1>()
                      orderby l1.Id
                      let inner = (from l2 in l1.OneToMany_Optional1
                                   where l2.Name != "Foo"
                                   let innerL1s = from innerL1 in ss.Set<Level1>()
                                                  where innerL1.OneToMany_Optional1.Any(innerL2 => innerL2.Id == l2.Id)
                                                  select innerL1.Name
                                   select innerL1s).FirstOrDefault()
                      select inner.ToList()

exception:

 The query contains a projection 'l1 => l1.OneToMany_Optional1
        .Where(l2 => l2.Name != "Foo")
        .Select(l2 => new { 
            l2 = l2, 
            innerL1s = DbSet<Level1>()
                .Where(innerL1 => innerL1.OneToMany_Optional1
                    .Any(innerL2 => innerL2.Id == l2.Id))
                .Select(innerL1 => innerL1.Name)
         })
        .Select(<>h__TransparentIdentifier0 => <>h__TransparentIdentifier0.innerL1s)
        .FirstOrDefault()' of type 'IQueryable<string>'. Collections in the final projection must be an 'IEnumerable<T>' type such as 'List<T>'. Consider using 'ToList' or some other mechanism to convert the 'IQueryable<T>' or 'IOrderedEnumerable<T>' into an 'IEnumerable<T>'.
  Stack Trace: 
    QueryableMethodNormalizingExpressionVisitor.VerifyReturnType(Expression expression, ParameterExpression lambdaParameter) line 168
    QueryableMethodNormalizingExpressionVisitor.VerifyReturnType(Expression expression, ParameterExpression lambdaParameter) line 147
    QueryableMethodNormalizingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression) line 124
    MethodCallExpression.Accept(ExpressionVisitor visitor)
    ExpressionVisitor.Visit(Expression node)
    ExpressionVisitorUtils.VisitArguments(ExpressionVisitor visitor, IArgumentProvider nodes)
    ExpressionVisitor.VisitMethodCall(MethodCallExpression node)
    QueryableMethodNormalizingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression) line 127
    MethodCallExpression.Accept(ExpressionVisitor visitor)
    ExpressionVisitor.Visit(Expression node)
    QueryTranslationPreprocessor.NormalizeQueryableMethod(Expression expression) line 84
    RelationalQueryTranslationPreprocessor.NormalizeQueryableMethod(Expression expression) line 45
    QueryTranslationPreprocessor.Process(Expression query) line 60
    RelationalQueryTranslationPreprocessor.Process(Expression query) line 54
    QueryCompilationContext.CreateQueryExecutor[TResult](Expression query) line 174
    Database.CompileQuery[TResult](Expression query, Boolean async) line 72
    QueryCompiler.CompileQueryCore[TResult](IDatabase database, Expression query, IModel model, Boolean async) line 114
    <>c__DisplayClass9_0`1.<Execute>b__0() line 98
    CompiledQueryCache.GetOrAddQuery[TResult](Object cacheKey, Func`1 compiler) line 78
    QueryCompiler.Execute[TResult](Expression query) line 94
    EntityQueryProvider.Execute[TResult](Expression expression) line 81
    EntityQueryable`1.GetEnumerator() line 93
    List`1.ctor(IEnumerable`1 collection)

maumar commented 3 years ago

JoasE commented 3 years ago

I experienced the same problem with the query below:

var defaultLanguageId = 1;
var selectedLanguageId = 2;

var query = from defaultTranslation in context.Translations
            where defaultTranslation.LanguageId == defaultLanguageId
            let translation = from translation in context.Translations
                                where translation.LanguageId == selectedLanguageId &&
                                    translation.Key == defaultTranslation.Key
                                select translation
            from selectedTranslation in translation.DefaultIfEmpty()
            select selectedTranslation ?? defaultTranslation;
var translations = originalQuery.AsNoTracking().ToList();

A possible workaround would be:

var query = from defaultTranslation in context.Translations
            where defaultTranslation.LanguageId == defaultLanguageId
            join selectedTranslation in context.Translations.Where(x => x.LanguageId == selectedLanguageId)
                on defaultTranslation.Key equals selectedTranslation.Key into ts
            from selectedTranslation in ts.DefaultIfEmpty()
            select selectedTranslation ?? defaultTranslation;

var translations = query.AsNoTracking().ToList();

This will result in the same SQL being executed using the SqlServer database provider 5.0.3

JoasE commented 3 years ago

@ajcvickers @maumar This issue is currently (part of) the reason my company is delaying the update to EF Core 5 for one of our projects. It seems like something which should be working in EF Core 5 (aswell as in 6). Is there any way this could be added to the 5.x milestone?

ajcvickers commented 3 years ago

@maumar to investigate patch

maumar commented 3 years ago

@JoasE easier workaround: add ToList() at the end of subquery defined in the "let":

var defaultLanguageId = 1;
var selectedLanguageId = 2;

var query = from defaultTranslation in context.Translations
            where defaultTranslation.LanguageId == defaultLanguageId
            let translation = (from translation in context.Translations
                                where translation.LanguageId == selectedLanguageId &&
                                    translation.Key == defaultTranslation.Key
                                select translation).ToList()
            from selectedTranslation in translation.DefaultIfEmpty()
            select selectedTranslation ?? defaultTranslation;
var translations = originalQuery.AsNoTracking().ToList();

maumar commented 3 years ago

Note (Smit's comment on the initial PR for the fix):

With pending selector in the middle coming in, we should just add a "fake wrapper" during pre-translation phase to match types and remove them during projection if it gets swallowed. If it doesn't then leave it in and shaper compiler can throw for it. Like how we do for NewExpression & constants.

maumar commented 3 years ago

@JoasE we decided this issue doesn't meet the bar for a patch, sorry. The workaround is quite straightforward. The exception message indicates which subquery fails the validation (although it incorrectly states its a top level projection). Just need to add ToList call around that subquery and it should work.

JoasE commented 3 years ago

@maumar Thanks for looking into this! I'll discuss this with my team

astorDev commented 3 years ago

@maumar can you elaborate on why it "doesn't meet the bar"? This completely breaks existing behaviour, therefore makes update to the latest ef core version painful and risky. If it's not the reason for patch I don't know what is

smitpatel commented 3 years ago

Blocked on #20291

mwwhited commented 1 year ago

@JoasE we decided this issue doesn't meet the bar for a patch, sorry. The workaround is quite straightforward. The exception message indicates which subquery fails the validation (although it incorrectly states its a top level projection). Just need to add ToList call around that subquery and it should work.

Do you realize how many millions of lines of entity queries this change broke?

We only just now got far enough down fight though breaking changes to get mostly on to .Net 6.0 since .Net Core 3.1 is deprecated. We are looking at having to manually retest and possibility rewrite several years worth of work.

mwwhited commented 1 year ago

Simple examples

            var notBroken = from i in new[] { 1 }.AsQueryable()
                            let subQuery = from j in new[] { 2 }.AsQueryable().ToList()
                                           select j
                            select new
                            {
                                i,
                                count = subQuery.Count()
                            };

            var broken = from i in new[] { 1 }.AsQueryable()
                         let subQuery = from j in new[] { 2 }.AsQueryable()
                                        select j
                         select new
                         {
                             i,
                             count = subQuery.Count()
                         };

            var vistor = new QueryableMethodNormalizingExpressionVisitor();
            var visitedNotBroken = vistor.Visit(notBroken.Expression);
            var visitedBroken = vistor.Visit(broken. Expression);

mwwhited commented 1 year ago

We need the ability to either detect your breaking changes (especially changes which occurred for no reason as they previously worked)

Or to have the ability to ignore this functionality. These new validations that do not occur until runtime are preventing entire applications from being upgraded from .Net Core 3.1 (which I remind you was deprecated by Microsoft after only two years.)

Also, fundamental, runtime only breaking changes can we expect in the future? At least when you change classes and namespaces we can get compile time errors and develop a targeted fix plan. These runtime errors make it impossible to find these new "bugs" without manually testing the entire platforms.

Some of these "features" are even built into the query providers so there is not even a way to write unit tests to discover them.

On top of all of that this change did not improve the platform. In theory you added this change to "prevent a possible error" an error that already threw and exceptional on first pass testing so was already resolved. This change breaks many previously working embedded subqueries. That is not an improvement but is in fact a fundamental determent.

Also, we can not simply replace the implementation of this validation as it is buried too deeply within the EF provider behind some factory classes. The only way to swap it out would be to replace most of the Relational assembly.

What's worse is we can also not reasonably create a workaround with expression visitors as the required type conversions requires replacing anonymous and real types then ensuing the outer projection still matches the original query.

Please restore the previous functionality

And yes, the same reversion is needed in 6.0 as we need support on LTS and despite your claims 7.0 libraries are not compatible with 6.0 libraries.

mwwhited commented 1 year ago

@JoasE we decided this issue doesn't meet the bar for a patch, sorry. The workaround is quite straightforward. The exception message indicates which subquery fails the validation (although it incorrectly states its a top level projection). Just need to add ToList call around that subquery and it should work.

There is no reasonable work around. Yes, we have a fix but there is no way to detect the new runtime error without executing every possible permutation of our applications.

There is no reasonable way to perform that testing for any complex application.

roji commented 1 year ago

These new validations that do not occur until runtime are preventing entire applications from being upgraded from .Net Core 3.1 (which I remind you was deprecated by Microsoft after only two years.)

.NET Core 3.1 reached end of life 3 years after being released, as is the standard Long-Term Support policy - see these docs.

Also, fundamental, runtime only breaking changes can we expect in the future?

We try hard to avoid breaking changes in general, both ones which show up at compilation and those which show up at runtime only. However, we do introduce these in some cases, when we believe they're essential to the product, and benefit our user community more than the work they create. Without discussing this particular breaking change, requiring a product to never make any breaking changes effectively condemns it to stagnate.

These runtime errors make it impossible to find these new "bugs" without manually testing the entire platforms.

We strongly recommend investing in a robust automated test suite; sufficient coverage should detect the failing queries immediately and help you apply the trivial workaround to them. Even if we never purposefully introduced breaking changes into the product, unintentional ones may find their way in, and a good test suite is the only way for users to be sure that their application continues working across upgrades.

mwwhited commented 1 year ago

1) I know .Net Core 3.1 only had a 3 year lifetime and not the previous 10 years we've come to expect. But we did kinda expect you to care about backwards compatibility has your history before .Net Core had shown Microsoft to care about such things.

2) I oversee multiple application and projects with various levels of funding. Sorry we are all multi-billion dollar companies that can ignore our customers and just spend other people's money. Full automated integration testing with 100% code coverage is just not an option... and as I noted unit testing would not find this issue as there is no unit testing option that is compatible with the change able query provide model.

3) these are not actually bugs. These subqueries are perfectly fine and would be no different than using a view or a CTE in normal SQL operations. Infact these queries execute perfectly if you remove the Expression Visitor that throws an exception just because we have a partial query let x = some IQuerable<> that we reuse one or more times later. The outcoming projections do not have the type conversion exception this "feature" was claimed to "protect" us from as the actual execution points for that subquery already have the expected .ToList() or .XXXOrDefault() that would be required.

I am not alone in this adventure of running into these issues. Many people haven't even came across them yet because they are just starting to look into .Net Core 3.1 being deprecated. This is a whole I have been running against for several years now and I'm expected to be able to resolve this for all of my customer projects that now must be updated.

As a note, if we are expected to manually retest full stack end to end on every major release of .Net, Azure Functions, ASP.Net and Entity Framework just to name a few parts of our platform... we will have to consider looking at other ecosystems entirely. We can not retest and rewrite our entire application just to update our dependencies.

Suggesting that your clients that have invested hundreds of thousands of dollars is not millions of dollar and decades of development just "go do some more testing" is a very sad statement when it is very obvious you do not even make such attempts on your own. It's actually fairly insulting to those of us that have been using this stack from the beginning and apricated when Microsoft actually cared about us developers and provided solutions for us.

roji commented 1 year ago

I know .Net Core 3.1 only had a 3 year lifetime and not the previous 10 years we've come to expect. But we did kinda expect you to care about backwards compatibility has your history before .Net Core had shown Microsoft to care about such things.

I was just reacting to you saying .NET Core was "deprecated" after only 2 years. We certainly do care very much about backwards compatibility, but the bar for breaking changes is indeed a little lower than the .NET Framework days, when side-by-side installation was impossible.

Similar to how breaking changes are problematic for existing users and applications, never changing problematic behavior because it could break existing users is also quite problematic.

[...] as I noted unit testing would not find this issue as there is no unit testing option that is compatible with the change able query provide model.

Unit testing isn't your only option; we recommend comprehensive testing against your real database system to ensure that your application - and its queries - actually works end-to-end. We've invested in extensive documentation comparing testing approaches and providing guidance on how to write good tests which involve your real database system.

I am not alone in this adventure of running into these issues. Many people haven't even came across them yet because they are just starting to look into .Net Core 3.1 being deprecated. This is a whole I have been running against for several years now and I'm expected to be able to resolve this for all of my customer projects that now must be updated.

You may be right, but from the responses and votes above, we definitely haven't been seeing this affecting a great many users. In similar cases, we've been there to guide people on how to modify their queries to adjust to the new version, and it generally works well.

As a note, if we are expected to manually retest full stack end to end on every major release of .Net, Azure Functions, ASP.Net and Entity Framework just to name a few parts of our platform... we will have to consider looking at other ecosystems entirely. We can not retest and rewrite our entire application just to update our dependencies.

Again, I'd advise looking into automating this process - especially for the kind of complex platforms you're saying you have. Manual testing is indeed not scalable and cannot be done for every release, but then again, requiring or assuming that all your dependencies will never, ever break is just as unfeasible. I doubt you'll find other platforms and/or ORMs which provide an absolute guarantee to never break, unless they've effectively stopped evolving.

Suggesting that your clients that have invested hundreds of thousands of dollars is not millions of dollar and decades of development just "go do some more testing" is a very sad statement when it is very obvious you do not even make such attempts on your own. It's actually fairly insulting to those of us that have been using this stack from the beginning and apricated when Microsoft actually cared about us developers and provided solutions for us.

We care very deeply about EF Core and its users, and really do our best to avoid introducing needless breaking changes. We invest massive effort into our testing suite, precisely to avoid unintentional breaking changes creeping int; we have over 100k tests running with each commit (and growing), most of which access the actual databases to make sure queries work on those databases. But of course, no product or test suite can be perfect.

I do understand the frustration here, and this break indeed would not have happened. Unfortunately, this isn't something that we can simply "roll back", otherwise we definitely would do that - it's part of far more complex changes, and doing something here has a high risk of breaking other people who have already migrated to 6.0/7.0. It's an unfortunate situation and I'd prefer to be able to give better news, but that's how it is.

mwwhited commented 1 year ago

Imagine you have a river… it’s a bit dangerous but most people have no issues navigating it. In this river you have a fork. To the left it’s a bit more difficult to navigate down and sometimes a few boats tip over but if they tip over its immediately after that fork. Anyone that make’s it past that point is smooth sailing no issues at all for years.

One day someone annoyed by some people sometimes complaining after their boat tipping over decides to do something about the fork. They have some choices.

1) Hang a small sign at the ramp warning people about fork but not really doing anything else because it’s a small number of boats that tip over and an even smaller number that complain after it happens. (this would documenting the problem in docs.microsoft.com and in the readme for EF)

2) Hang the sign at the ramp and add a add a waiver to the boat ramp letting people know about the dangers. (this would be documenting the issue and maybe using code analysis to add a warning.)

3) Hang a fine print sign about the danger of the fork in the toilet at the pub down the street then toss grenades into the boat of anyone taking the fork in the future. (this is what you did.)

mwwhited commented 1 year ago

And BS. Removing the exception you added to QueryableMethodNormalizingExpressionVisitor would do nothing bad to those using EF 6+. It would allow them to use subqueries in their queries again without having runtime exceptions preventing them from writing reusable code.

mwwhited commented 1 year ago

BTW, you could also update the engine to check the expected projection and do the proper conversions at runtime allowing even for IQueryable<T> properties to be supported on projected entity models.

TWhidden commented 1 year ago

Throwing in a +1 for this issue - This is an unfortunate blocking issue for us also. We are in the process of converting Framework to Core and ran into this. We use sub queries to do sql filtering for event patterns so we don't have to do it in-memory. The above suggestion to do a .ToList() is not a solution. We are searching through millions of records, and don't need to bring them into our core app first to filter them down to 10 instances. Thats the job for SQL server.

Example of just one of our queries:

from posMain in context.POSDataEventView 
 let posSub = from posSub1 in context.POSDataEventView where 
   posSub1.CheckNumber == posMain.CheckNumber && 
   posSub1.EventTypeID == 26 && 
   posSub1.BasicData.ToLower().Contains("cash") &&
   posSub1.DataTimeStamp >= posMain.DataTimeStamp.AddMinutes(-1) && 
   posSub1.DataTimeStamp < posMain.DataTimeStamp.AddMinutes(5) &&
   posSub1.DataOwnerID == posMain.DataOwnerID
   select posSub1 
where 
 posMain.EventTypeID == 6  &&  posSub.Any()

select posMain

EF6 under Framework executes this just fine. It would be nice (actually - expected) to have that same long time supported feature in core.

Has anyone else found a work around permit the use of a sub query with the let statement?

mwwhited commented 1 year ago

Has anyone else found a work around permit the use of a sub query with the let statement?

It's pretty unlikely they are going to unbreak this. You might be able to get the query to execute by making it a subquery in the predicate by tossing the let and wrapping with parentheses. Won't look at clean but also shouldn't get this pointless exception.

JoasE commented 1 year ago

@TWhidden I'm pretty sure when I tested the workaround @maumar suggested I checked the query that was being sent to the sql server. I remember it being the same query, so the .ToList() call doesn't actually execute the subquery and load the results into memory. Maybe double check the EF logs with the ToList to see what it's actually doing. I also posted another workaround in one of my previous comments. It invloves rewriting the LINQ to remove the let statement. I think the LINQ will look more like what it's actually sending to the sql server.

My older post:

I experienced the same problem with the query below:

var defaultLanguageId = 1;
var selectedLanguageId = 2;

var query = from defaultTranslation in context.Translations
            where defaultTranslation.LanguageId == defaultLanguageId
            let translation = from translation in context.Translations
                                where translation.LanguageId == selectedLanguageId &&
                                    translation.Key == defaultTranslation.Key
                                select translation
            from selectedTranslation in translation.DefaultIfEmpty()
            select selectedTranslation ?? defaultTranslation;
var translations = originalQuery.AsNoTracking().ToList();

A possible workaround would be:

var query = from defaultTranslation in context.Translations
            where defaultTranslation.LanguageId == defaultLanguageId
            join selectedTranslation in context.Translations.Where(x => x.LanguageId == selectedLanguageId)
                on defaultTranslation.Key equals selectedTranslation.Key into ts
            from selectedTranslation in ts.DefaultIfEmpty()
            select selectedTranslation ?? defaultTranslation;

var translations = query.AsNoTracking().ToList();

This will result in the same SQL being executed using the SqlServer database provider 5.0.3

TWhidden commented 1 year ago

In our case, we could have several levels of sub-queries, depending on the pattern matching requirement in our application.

+mainrow
 + subquery1a
     +subquery1b 
 + subquery2a
     +subquery2b

I don't mind programmatically patching queries (there are potentially thousands are not known by the developers, but user created), but for sure expect a sub-query to be within many of them.

in this case, joins will not work for us. for one, way to complex to convert and cant patch existing user created ones programmatically.

Is there a reason that the EF Framework way was removed or broken for EF Core?

Can we write some kind of Visitor pattern to intercept this, and instead of throwing the exception, extending the SQL somehow? Just kinda shocked that this was removed from EFCore. I can't seem to find out why it was though, unless it's just a bug.

mwwhited commented 1 year ago

I tried doing this myself. Because of the way it is implemented you need to duplicate the majority of the EF Core framework.

I even tried writing a visitor that could scan the source code detect the issue so it could be reported and fixed but there were so many different combinations that trigger false positives it not practical to do. The only reliable fix I found was a full rewrite and as such the project based on .Net 3.1 and EF Core 3.1 was version locked as there wasn't funding to move forward with the upgrade.

As a note the suggested support from the EF team was to write integration tests that would allow for testing every possible code path... which again was not practical.

Before this "fix" was applied there were sometimes issues where poorly formed queries would cause runtime errors. Those errors would have been caught in initial development. This change with such a broad detection scope has so many false postives that no large code base with complex EF queries would be reliably upgraded without full integrated testing regress.

TWhidden commented 1 year ago

Makes sense - Just more work now on our part to move forward.

I was able to find a work around with what you said ( ChatGPT may have assisted here ) I'll have to spend some time to find out how I can patch user queries to do the same thing.

Working sub query:

from posMain in context.POSDataEventView
    where posMain.EventTypeID == 6 &&
        (
            from posSub1 in context.POSDataEventView
            where posSub1.CheckNumber == posMain.CheckNumber &&
                posSub1.EventTypeID == 26 &&
                posSub1.BasicData.ToLower().Contains("cash") &&
                posSub1.DataTimeStamp >= posMain.DataTimeStamp.AddMinutes(-1) &&
                posSub1.DataTimeStamp < posMain.DataTimeStamp.AddMinutes(5) &&
                posSub1.DataOwnerID == posMain.DataOwnerID
            select posSub1
        ).Any()
    select posMain

At least now there is some light at the end of the tunnel. For sure far from perfect - nobody wants to patch user queries, but a possible solution for sure.

douglasg14b commented 7 months ago

We strongly recommend investing in a robust automated test suite; sufficient coverage should detect the failing queries immediately and help you apply the trivial workaround to them

What is the trivial workaround for nested queries here? I feel like I'm missing something :/

x.Orders.AsQueryable().Select(OrderToOrderResponseProjection).ToList() appears to result in the same issues 🤔

mwwhited commented 7 months ago

There is no trivial work around… though your issue is the query was getting upgraded to line to objects. Remove the .AsQueryable and make sure that names function is return an expression tree with no sub queries. The undetectable breaking change is an alteration in how subqueries are detected. They try to protect you from the linq to object transparent query by killing all sub queries that do not return a collection.

there isn’t a good way to statistically check go of the possible broken code points so you will have to test your entire application surface that uses EF. And that will have to be don’t with integration tests as unit tests with inmemorydb do not have this issue.

it’s a PIA. Good luck.

douglasg14b commented 7 months ago

Remove the .AsQueryable and make sure that names function is return an expression tree with no sub queries

In my case I cannot since OrderToOrderResponseProjection is an uncompiled expression (Intentionally so). And there doesn't appear to be an IEnumerable Select that accepts an Expression<Func<>>, and I have not figured out how to sort that out myself yet 🤔

Thankfully in my case I am building a new application, but some of the projections are large, and in some cases have their expression trees dynamically and recursively modified at query-time, which necessitates the use of IQueryable.Select.

mwwhited commented 7 months ago

Yeah, you have to project them into a list or collection now. They should still be composable but who knows without testing. This change is highly problematic for many of us and is often requiring rewrites for fairly new applications… and if you look at the threat you can see how much the EF team didn’t care that they made such a massive breaking change. They annoying thing is the “protection” they added prevents not only code that would have already thrown runtime errors but also prevents subquery reuse even just in predicates which would have never had the casting issue they were “fixing”. (Which they didn’t actually fix they just throw a more informative exception more often.)

dotnet / efcore

Query: IQueryable in subquery sometimes triggers validation which prevents IQueryables in the final projection #23302