Closed smitpatel closed 2 years ago
Don't waste your time with this useless enhancement. This definitely is NOT what I wanted. Client eval, seriously? If it wasn't clear after all these writings in the now closed issue, I want FULL SERVER support for this LINQ operator. And btw, it has nothing in common with GroupBy
. It's a correlation operator similar to Join
, but producing correlated set rather than flattened set. Putting it into the same box as GroupBy
is a BIG mistake. IMHO. This my last comment here, luckily I'm not using EF Core in production, hence I don't really care, except for the unlucky people which do use and I'm trying to help overcoming the unnecessary limitations put by EF Core team. We all know LINQ to Queryable is broken, but EF Core makes that a rule rather than an exception. The sad thing is that even though EF6 has no client eval concept, it translates and evaluates much more LINQ constructs than EF Core, so it's not surprising that some people actively propose EF6.4 or EF Classic as better EF Core alternatives. Which is sad because I like the simplicity of the EF Core exposed metadata, also the replaceable services architecture, but w/o proper query translation it is just not useful in real world applications. No more comments.
Really last comment :-) (hopefully constructive)
Here is quick and dirty EXTERNAL implementation of what I wanted:
public static partial class QueryableExtensions
{
public static IQueryable<T> ConvertGroupJoins<T>(this IQueryable<T> source)
{
var expression = new GroupJoinConverter().Visit(source.Expression);
if (source.Expression == expression) return source;
return source.Provider.CreateQuery<T>(expression);
}
class GroupJoinConverter : ExpressionVisitor
{
protected override Expression VisitMethodCall(MethodCallExpression node)
{
if (node.Method.DeclaringType == typeof(Queryable) && node.Method.Name == nameof(Queryable.GroupJoin) && node.Arguments.Count == 5)
{
var outer = Visit(node.Arguments[0]);
var inner = Visit(node.Arguments[1]);
var outerKeySelector = Visit(node.Arguments[2]).UnwrapLambdaFromQuote();
var innerKeySelector = Visit(node.Arguments[3]).UnwrapLambdaFromQuote();
var resultSelector = Visit(node.Arguments[4]).UnwrapLambdaFromQuote();
var outerKey = outerKeySelector.Body.ReplaceParameter(outerKeySelector.Parameters[0], resultSelector.Parameters[0]);
var innerKey = innerKeySelector.Body;
var keyMatch = MatchKeys(outerKey, innerKey);
var innerQuery = Expression.Call(
typeof(Queryable), nameof(Queryable.Where), new[] { innerKeySelector.Parameters[0].Type },
inner, Expression.Lambda(keyMatch, innerKeySelector.Parameters));
var resultTypes = resultSelector.Parameters.Select(p => p.Type).ToArray();
var tempProjectionType = typeof(Tuple<,>).MakeGenericType(resultTypes);
var tempProjection = Expression.New(
tempProjectionType.GetConstructor(resultTypes),
new Expression[] { resultSelector.Parameters[0], innerQuery },
tempProjectionType.GetProperty("Item1"), tempProjectionType.GetProperty("Item2"));
var tempQuery = Expression.Call(
typeof(Queryable), nameof(Queryable.Select), new[] { outerKeySelector.Parameters[0].Type, tempProjectionType },
outer, Expression.Lambda(tempProjection, resultSelector.Parameters[0]));
var tempResult = Expression.Parameter(tempProjectionType, "p");
var projection = resultSelector.Body
.ReplaceParameter(resultSelector.Parameters[0], Expression.Property(tempResult, "Item1"))
.ReplaceParameter(resultSelector.Parameters[1], Expression.Property(tempResult, "Item2"));
var query = Expression.Call(
typeof(Queryable), nameof(Queryable.Select), new[] { tempProjectionType, projection.Type },
tempQuery, Expression.Lambda(projection, tempResult));
return query;
}
return base.VisitMethodCall(node);
}
static Expression MatchKeys(Expression outerKey, Expression innerKey)
{
var newOuterKey = outerKey as NewExpression;
if (newOuterKey == null)
return Match(outerKey, innerKey);
var newInnerKey = (NewExpression)innerKey;
Expression keyMatch = null;
for (int i = 0; i < newOuterKey.Arguments.Count; i++)
{
var match = Match(newOuterKey.Arguments[i], newInnerKey.Arguments[i]);
keyMatch = keyMatch != null ? Expression.AndAlso(keyMatch, match) : match;
}
return keyMatch;
}
static Expression Match(Expression left, Expression right)
=> Expression.Equal(left, right);
}
}
When applied, it allows translation/processing (in 3.1.1) of all GroupJoin
scenarios.
The question is, if I can do that externally, why can't you do something similar internally (using all the power of your expression processing helpers/experience) :-/ Cheers.
I think this is a good enhancement because without it, the (general) EF Core 3.x advice to manually switch to client-side evaluation does exactly what has been often commented as worth avoiding in automatic client side evaluation, i.e. pulling a lot of data from the server unnecessarily.
At least, in something like
var ans = context.Customers
.Where(c => c.CustomerID < 100)
.GroupJoin(context.Orders, c => c.CustomerID, o => o.CustomerID, (c, os) => new { c, os })
.ToList();
where could you put AsEnumerable
that wouldn't involve pulling a lot of unnecessary Orders rows?
(I hope this case would be handled.)
It would be really good if this supports projection at least so that unneeded columns aren't pulled to the client as in:
var ans = context.Customers
.Where(c => c.CustomerID < 100)
.GroupJoin(context.Orders, c => c.CustomerID, o => o.CustomerID, (c, os) => new { c.CustomerID, OrderIDs = os.Select(o => o.OrderID) })
.ToList();
I am still not sure if translating to a single correlated SQL query versus two queries to minimize data transfer would be most efficient, but I suppose the answer would very depending on the database.
context.Customers
.Where(c => c.CustomerID < 100)
.Select(c => new {
c = c,
os = context.Orders.Where(o => c.CustomerID == o.CustomerID).ToList()
})
is equivalent query. LINQ is magical.
context.Customers .Where(c => c.CustomerID < 100) .Select(c => new { c = c, os = context.Orders.Where(o => c.CustomerID == o.CustomerID).ToList() }) is equivalent query. LINQ is magical.
Yes, it is. That's why the EF job is to convert/transform/treat the original as the manually written equivalent.
SQL also has different valid query constructs (WHERE IN (…), WHERE EXISTS (…),
JOIN`) which can produce equivalent results, but they are not forcing people to use just one of them. Modern query optimizers recognizes them and create one an the same execution plan. And EF Core is intended to be a "modern" framework, isn't it?
P.S. Sorry for repeating/linking to it couple times, but see the ConvertGroupJoins
transformation from https://github.com/dotnet/efcore/issues/19930#issuecomment-587476008 and then below applied to @NetMage example and your response:
var qA = context.Customers
.Where(c => c.CustomerID < 100)
.GroupJoin(context.Orders, c => c.CustomerID, o => o.CustomerID, (c, os) => new { c, os });
var qB = context.Customers
.Where(c => c.CustomerID < 100)
.Select(c => new {
c = c,
os = context.Orders.Where(o => c.CustomerID == o.CustomerID).ToList()
});
var qC = qA.ConvertGroupJoins();
// qC is same as qB, should be done automatically by EF :)
Yes, it is. That's why the EF job is to convert/transform/treat the original as the manually written equivalent.
While in theory that's true, given that the amount of work we have is huge, we prioritize working on translations (and features in general) which don't have easy/simple workaround such as this one. If it's possible to rewrite your LINQ query so that it works - and is even simpler as in the case above - isn't that reasonable?
Yes, it is. That's why the EF job is to convert/transform/treat the original as the manually written equivalent.
While in theory that's true, given that the amount of work we have is huge, we prioritize working on translations (and features in general) which don't have easy/simple workaround such as this one. If it's possible to rewrite your LINQ query so that it works - and is even simpler as in the case above - isn't that reasonable?
@roji No, it isn't, because you want be to do something additional work which is not strongly necessary. Because my time is costly too.
Just out of curiosity, why you spent your valuable time writing the (IMHO more complicated) GroupJoinFlatteningExpressionVisitor
to support the weird LINQ left join pattern? You could have asked the end users to rewrite their LINQ queries using the even simpler correlated subquery syntax, right?
Why don't you go further and remove the whole LINQ support and ask the end users to rewrite their queries using some "even simpler" but EF Core specific dialect for their queries?
Finally, it's a shame the "modern" framework translating less LINQ patterns than its "obsolete" predecessor. Just FYI :)
Once again, where we can, we add translations based on the implementation cost, on whether a different construct exists to express the same thing, and importantly, based on user feedback. This issue has received almost no votes or feedback from users, which is a main reason why we haven't prioritized it.
That is nice, could it not be used as a GroupJoin
translation?
@roji
This issue has received almost no votes or feedback from users,
Because the victims go to StackOverflow instead of here?
@NetMage This is what I'm fighting for here. But they "see no value" :(
This issue has received almost no votes or feedback from users,
Because the victims go to StackOverflow instead of here?
We do have other issues with hundreds of votes - this one does not. Note that @smitpatel - who is a team member - is the one who opened this issue in the first place, so it's not true that we see no value. But at the risk of repeating myself, this issue has very little user requests, and there is another LINQ construct that produces the same result (one could even say that it's simpler). Therefore, since we have an enormous list of things to do and our resources are limited, this issue is in the backlog (not closed).
@powermetal63 Theoretically, if one cares enough, an enhancement could be done by someone from the community. Unfortunately having a usable EF Core is not my primary job, but I am working on a C# extension method using this translation in my spare time.
@roji From my point of view this is continuation of #17068, which was closed by @smitpatel with the following conclusion
GroupJoin operator GroupJoin is mainly used to generate LeftJoin and there is no SQL equivalent. Hence we will not translate any composition over GroupJoin. Filed #19930 to client eval GroupJoin when it is final query operator and grouping is not being composed over. In all other cases, GroupJoin is not going to supported. GroupJoin <-> GroupBy equivalence Due to high cost of pattern matching GroupJoin operator, we are not going to add any translation in GroupJoin even if equivalent GroupBy query would be translated. Customers would be required use GroupBy syntax as that is closest representation of what is the SQL.
And here are some related issues recently closed as duplicate of it - #13887, #14490, #20660
What I'm trying to argue from the beginning is that putting GroupJoin
into same box as GroupBy
is wrong. And what I really want is reopening a separate issue for (or changing this to) Query: Support GroupJoin translation. With the following rationale:
As you can see from my sample code, doing that is quite easy even with external code. You already put significant effort in 3.x to support just one variation - LINQ left join pattern using SelectMany on GroupJoin.DefaultIfEmpty, and even additional effort for porting (refactoring) it into this huge TryFlattenGroupJoinSelectMany method.
But running my ConvertGroupJoins
method not only solves the GroupJoin
translation in general, but also totally eliminates the need of that TryFlattenGroupJoinSelectMany
code (because there are no GroupJoin
s in the query expression tree at all!).
What I'm suggesting is replacing TryFlattenGroupJoinSelectMany
with GroupJoin
transformation similar to my sample code (adjusted with your internal coding standards) and voila - issue solved once forever. All that with much less code than currently.
@powermetal63 What is ReplaceParameter
in your example?
@sungam3r Nothing special, a typical expression visitor based helper for finding ParemeterExpression
instances inside expression tree and replacing them with arbitrary Expression
. Something like
public static partial class ExpressionUtils
{
public static Expression ReplaceParameter(this Expression source, ParameterExpression parameter, Expression value)
=> new ParameterReplacer { Parameter = parameter, Value = value }.Visit(source);
class ParameterReplacer : ExpressionVisitor
{
public ParameterExpression Parameter;
public Expression Value;
protected override Expression VisitParameter(ParameterExpression node)
=> node == Parameter ? Value : node;
}
}
@powermetal63 Can you provide a sample of the transformation your ConvertGroupJoins
does on a query?
This issue has made me realize that there is a serious pain point with EF Core in its current state (5.x). It is something I have run into quite a few times before, but I didn't see the patter up till now.
Although EF Core is mostly nice and allows you to do many things, the current state of its Linq API isn't nice:
Solution per point (not sure which of these are feasible):
SQL doesn't have GroupJoin operation on server because it cannot represent IEnumerable which grouping generates. There is no alternative for it. So neither of above raised point helps apart from whatever is stated already in this issue.
Sorry that none of my points were helpful. I was only trying to voice my experience as a user, including trying to suggest some ideas that might lead to improvements. You have to realize that knowledge about ORM library implementations is so specific, that nearly none of the users actually understand the inner workings (even though they may be experienced developers / architects)
If I read your comment correctly you are saying that LINQ makes for a bad / leaky abstraction of SQL:
Now I know that on some level every ORM will suffer from limitations of their abstraction, which is why most ORMs have an option to execute raw SQL.
This makes me wonder:
Can you share any other abstraction which is familiar, intuitive and doesn't require user to learn altogether new query language? There is one abstraction which upto an extent can be made work by introducing operations which are supported by database but that abstraction would also run into issue because EF Core is not tied to one specific database type. There is also variation on what works on one database compared to another. So we would always be in a boat where regardless of abstraction chosen, some query won't always work.
I do have some experience with Spring Data JPA, but not enough to actually judge whether it is 'better' than EF Core (which even then would probably be quite subjective).
Now I believe the approach there is:
Just as an idea: it might be really insightful for the EF Core team to talk with teams that build stuff like this (Spring JPA / Hibernate etc). Looking 'over the hedge' at what people are doing in other ecosystems can lead to very cool new ideas.
There are other 3rd party libraries which integrate on top of EF Core and provide all those functionality. OData is one big consumer with EF Core which provides paging/sorting/filtering functionality in the way you mention above without users having to write whole query themselves. We work with them so the queries generated by them are translated correctly in EF Core. There also other extension on top of EF Core to provide more functionality from database side & also other 3rd party library which can be used to create queries.
OUCH! I have LINQ queries with 'into' clauses all over my Legacy code. I have just tried to upgrade that old Entity Framework v5.0 to Entity Framework core and hit this issue almost immediately. Stop removing features that used to work! I am tired of all the products being released with breaking changes and products that provide no upgrade path to get from a prior version to a new version. I am leaning towards abandoning Entity Framework as the upgrade hassle is beginning to outweigh the benefits. We just upgraded 30-year old legacy code and the SQL statements just ported straight into the new app! That's a big plus for just staying with good old SQL.
This issue is about enabling following query.
We will not client eval GroupJoin in any of following cases
os
in above query in any formAll above unsupported cases will still throw client evaluation exception even after this issue is fixed.
GroupJoin operator in query which is used to generated LeftJoin in SQL as described in our documentation is supported and will remains so. Even after that use case, if you still need EF Core to client eval GroupJoin operator with aforementioned constraints please upvote this issue. We are not looking to client eval any other GroupJoin which appears in the middle of the query.