dotnet / csharplang

The official repo for the design of the C# programming language

Proposal: extend the set of things allowed in expression trees #2545

Open gafter opened 9 years ago

gafter commented 9 years ago

Expression trees support only a subset of the things that could be semantically meaningful. Consider allowing them to support the following constructs:

Mr-Byte commented 9 years ago

I think allowing the null operators in expression trees would be one of the most useful changes, especially with regard to frameworks that provide LINQ support for databases.
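For context, the compiler currently rejects `p => p?.Name` when the lambda is converted to an expression tree, so the usual workaround is to build the null check by hand. A minimal sketch of that workaround follows; the `Person` type and `BuildNameAccessor` helper are hypothetical names used only for illustration.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity used only for this illustration.
public class Person
{
    public string Name { get; set; }
}

public static class NullSafeAccess
{
    // Builds p => p == null ? null : p.Name by hand, standing in for p => p?.Name.
    public static Expression<Func<Person, string>> BuildNameAccessor()
    {
        ParameterExpression p = Expression.Parameter(typeof(Person), "p");
        Expression body = Expression.Condition(
            Expression.Equal(p, Expression.Constant(null, typeof(Person))),
            Expression.Constant(null, typeof(string)),
            Expression.Property(p, nameof(Person.Name)));
        return Expression.Lambda<Func<Person, string>>(body, p);
    }

    public static void Main()
    {
        Func<Person, string> getName = BuildNameAccessor().Compile();
        Console.WriteLine(getName(new Person { Name = "Ada" })); // Ada
        Console.WriteLine(getName(null) == null);                // True
    }
}
```

Whether a given LINQ provider translates the resulting conditional node is up to that provider; the sketch only shows the shape of the tree.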

jdh28 commented 9 years ago

To add to your list, event subscription and unsubscription.

VictorBlomberg commented 9 years ago

I believe that supporting statement-bodied lambdas would make expression trees a lot more interesting in many real world scenarios. I posted to the Visual Studio UserVoice site a while back, and it seemed like there is some interest in this.

olmobrutall commented 9 years ago

+1 for enabling ?. in expression trees. I did a pull request some months ago (https://roslyn.codeplex.com/SourceControl/network/forks/Olmo/NullPropagationInExpression/contribution/7700) and discussed the idea (https://roslyn.codeplex.com/discussions/571077). It's not complicated, but tell me if you're interested in an updated version.

olmobrutall commented 9 years ago

Also, invocations that use default or named parameters in expression trees would be nice :)

kbirger commented 9 years ago

I don't think that dynamic would be possible, because it is handled at runtime, which would prevent the expressions from being handled during compilation.

D3-LucaPiombino commented 9 years ago

I am also very much interested in allowing statement-body like @VictorBlomberg for dsl purposes. It would open some interesting scenarios in manipulating expression trees that right now are a pain.

iSynaptic commented 9 years ago

We now have many different ways to represent code as an object model: CodeDom, LINQ Expression Trees, and now Roslyn. Since Roslyn is the most complete of all the models (and will continue to be so), is there any way it could become the basis for lambda expressions going forward? Perhaps by introducing something like a RoslynExpression (name not important) in order to support backward compatibility by not changing Expression (and its ecosystem). This way there isn't a continuous need to create parallel representations of lambda expressions to support new language features.

That said, a couple things jump out as making this a less ideal approach:

Thoughts?

svick commented 9 years ago

@iSynaptic I wouldn't consider the CodeDOM object model for any new code. It doesn't support even some of the most basic things (like static classes).

And regarding Roslyn and Expression trees, I would actually like to see expression trees expand to be able to do more things (e.g. that you could generate whole types using them), not the other way around, like you're proposing. I think that Expression trees and Roslyn syntax trees have very different primary use cases and because of that there are significant differences between them. And those differences make the kind of metaprogramming you want to do with Expression trees much harder to do with Roslyn.

Some of the differences include:

hivanov commented 9 years ago

I probably think the Expression Trees have the following huge weaknesses:

  1. Not reclaimable by GC. Expression.Compile() and friends tend to eat up your memory, thus a special caching logic is needed all the time;
  2. Not suitable for writing types -- only static methods are supported. This is probably pain number one. Even if you could compile them into Type methods, the generated methods are still static. This prevents proper proxying of some classes (think Castle.DynamicProxy and friends on steroids). The result is having custom boilerplate to handle class state. There is a whole host of problems that could be solved by expression trees type generation. Think for example:
    • Code that optimizes itself on-the-fly based on configuration parameters;
    • Custom language parsers (in combination with ANTLR and friends);
    • Dynamic code generation (generic interface implementation, handy for, say, network communication).

mkosieradzki commented 9 years ago

@hivanov On the other hand, a great thing about Expressions: they are currently the only way to run "dynamic code" (without writing your own interpreter) inside Windows Store Applications since, as far as I understand, they are even .NET Native compatible. If the tendency is to go towards AOT instead of JIT, dynamic types are probably not coming back.

Regarding GC issues: I am using a lot of Expression.Compile and never had memory issues with them (maybe due to the caching logic) - I must check this out.

And of course: +1 for "null-coalescing operators".

axel-habermaier commented 9 years ago

@mkosieradzki: Regarding System.Linq.Expressions.Expression in .NET Native: Yes, as far as I'm aware, they are supported, albeit not in a compiled way (no runtime code generation is possible), but in a slow, interpreted way.

tmat commented 9 years ago

@axel-habermaier Turns out that the interpreted way is actually faster than the compiled way if the expression is executed less than ~50 times.
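As a rough illustration of the two execution modes being compared, newer .NET versions expose a `Compile(bool preferInterpretation)` overload. The sketch below assumes a runtime where that overload is available; the `CompileModes` class is a hypothetical name, and the snippet does not measure the ~50-invocation threshold mentioned above.

```csharp
using System;
using System.Linq.Expressions;

public static class CompileModes
{
    private static readonly Expression<Func<int, int>> Square = x => x * x;

    // preferInterpretation: true asks the runtime to interpret the tree
    // instead of emitting IL, when an interpreter is available.
    public static Func<int, int> Build(bool preferInterpretation)
        => Square.Compile(preferInterpretation);

    public static void Main()
    {
        Console.WriteLine(Build(preferInterpretation: false)(7)); // 49
        Console.WriteLine(Build(preferInterpretation: true)(7));  // 49
    }
}
```

Both delegates return the same results; which one is cheaper overall depends on how often the delegate is invoked, so it is worth timing with your own expressions.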

mkosieradzki commented 9 years ago

It's not slow. It's good-enough for scenarios I am using it for (scripting inside application).

@hivanov I am unable to reproduce memory leak behavior when using Compile. I have tried to create millions of different expressions, evaluate them and have literally no memory leaking. I would expect at least one "byte" leaking per expression but after GC memory returns to the initial usage. Are you sure there is a memory leak issue related to Expressions?

hivanov commented 9 years ago

@mkosieradzki It actually depends on the expressions you build. I have found out, on numerous occasions, that expressions that reference external stuff (mostly, mapped external variables) fail to be released even if the referencing code + the variables are no longer reachable. Especially if the variables are something like channel description.

I still stand behind my general idea for the extension of the Expression Trees to be able to generate whole classes or, at least, methods that support "this", so they could be used in TypeBuilders.

drub0y commented 8 years ago

+1 for statement bodied lambdas.

My use case for this has always been the ability to write GPU programs against a framework that just compile as expressions in C# and then get hoisted and transpiled to a GPU specific byte code at runtime such as nVidia's CUDA PTX or even higher level stuff like DirectX/OpenGL's shaders or OpenCL via some kind of provider model.

exyi commented 8 years ago

:+1: for all these features. I really love expression trees and it would be nice to have more support from the language. It could also be nice to have something like a [ReflectedDefinition] attribute (similar to F#) to get an expression tree from a method. Imagine having an ORM which will save the method to the DB and allow using it in LINQ queries.

olmobrutall commented 8 years ago

@exyi this is actually a really good idea. If you can expose Properties and Methods as expression trees you can factor-out a lot of complexity of LINQ queries in reusable parts.

We use a cumbersome convention involving declaring the method body as a static Expression<T> field, an ExpressionFieldAttribute and some MSBuild / Cecil magic to bind those together.

https://github.com/signumsoftware/framework/commit/a7e369966d99dc8a6fc36eb7a0b7227d50223f30

public class PersonEntity
{
    (...)

    // How it is today
    static Expression<Func<PersonEntity, bool>> IsAmericanExpression = p => p.Country == "USA";
    [ExpressionField]
    public bool IsAmerican
    {
        get { return IsAmericanExpression.Evaluate(this); }
    }

    // How it could be
    [ExpressionDefinition]
    public bool IsAmerican => this.Country == "USA";
}

@gafter If I do a pull request for this, any chance it will get accepted?
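For readers unfamiliar with the convention, here is a minimal sketch of how an `Evaluate` helper like the one used above might work. The `ExpressionEvaluation` class and its cache are assumptions made for illustration, not the actual Signum implementation.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq.Expressions;

public static class ExpressionEvaluation
{
    // Cache keyed by expression instance, so each static field is compiled only once.
    // Entries are held for the lifetime of the process.
    private static readonly ConcurrentDictionary<LambdaExpression, Delegate> Cache =
        new ConcurrentDictionary<LambdaExpression, Delegate>();

    // Hypothetical stand-in for the Evaluate call in the snippet above.
    public static TResult Evaluate<T, TResult>(
        this Expression<Func<T, TResult>> expression, T arg)
    {
        var compiled = (Func<T, TResult>)Cache.GetOrAdd(expression, e => e.Compile());
        return compiled(arg);
    }
}

public class PersonEntity
{
    public static readonly Expression<Func<PersonEntity, bool>> IsAmericanExpression =
        p => p.Country == "USA";

    public string Country { get; set; }

    // In-memory callers run the cached delegate; a LINQ provider could instead
    // read IsAmericanExpression and inline it into a query.
    public bool IsAmerican => IsAmericanExpression.Evaluate(this);
}
```

The cache uses reference equality on the expression object, which is enough when expressions live in static fields; expressions built on every call would need a different key.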

D3-LucaPiombino commented 8 years ago

@olmobrutall +1. We have the same problem. I think that the scenario you are describing is very common in a lot of applications that are, for example, Entity Framework based, where one wants to build a set of expressions and reuse them by combining them.

exyi commented 8 years ago

@olmobrutall It's quite often a good idea to steal F# features :). I wanted to use it in the DotVVM framework to translate simple functions to JavaScript. It would be quite similar to what FunScript does, since they use F#'s ReflectedDefinition attribute.

alrz commented 8 years ago

@gafter Isn't it planned to make Expression classes a record type so the following would be possible?

switch(expr) {
case LambdaExpression(UnaryExpression(MemberExpression memberExpr)):
case LambdaExpression(MemberExpression memberExpr):
  ...
}

// instead of 

switch(expr) {
case LambdaExpression { Body is UnaryExpression { Operand is MemberExpression memberExpr } }:
case LambdaExpression { Body is MemberExpression memberExpr }:
  ...
}

// (I'm using OR patterns, don't mind)

Also, is there any ExpressionType like enum generated for record types as an optimization for pattern matching (like Tag in F#) or it's just type checking?
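For comparison, property patterns as they eventually shipped (C# 8 and later) can already express this kind of match without the Expression classes being records. A small sketch, with `MemberNames` and `GetMemberName` as hypothetical names:

```csharp
using System;
using System.Linq.Expressions;

public static class MemberNames
{
    // Returns the member name for lambdas such as s => s.Length,
    // including the case where the compiler wraps the access in a Convert node.
    public static string GetMemberName(LambdaExpression expr)
    {
        switch (expr)
        {
            case { Body: UnaryExpression { Operand: MemberExpression viaConvert } }:
                return viaConvert.Member.Name;
            case { Body: MemberExpression direct }:
                return direct.Member.Name;
            default:
                throw new ArgumentException("Expected a member access.", nameof(expr));
        }
    }

    public static void Main()
    {
        Expression<Func<string, int>> direct = s => s.Length;
        Expression<Func<string, object>> boxed = s => s.Length; // boxing adds a Convert node
        Console.WriteLine(GetMemberName(direct)); // Length
        Console.WriteLine(GetMemberName(boxed));  // Length
    }
}
```

This is plain type checking plus property access; it does not rely on any tag or enum optimization.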

gafter commented 8 years ago

@alrz we do not currently have any such plan.

gulshan commented 8 years ago

Should pure and impure expressions be differentiated in C#? Maybe that would enable some enhancements in the case of pure expressions. Or has this been done already?

Alphish commented 8 years ago

+1 for more Expression Trees support, especially the null-propagating accessors or statement-bodied lambdas (as far as I know the latter are already possible, except it requires using troublesome Expression Trees API to build them). Wouldn't like these to lag too far behind the existing C# features.

On a side note, would it be possible to support async/await with LINQ expressions as well, or is it too much of a stretch? According to this SO thread it'd require a major compiler rewrite, but I'd like to have an official stance from the C# devs to refer to. ^^'

kbirger commented 8 years ago

Think about the problem conceptually. The await keyword actually generates some syntax precompile which generates a state machine (You can find various levels of detail for this online, but here - for example: http://www.filipekberg.se/2013/01/16/what-does-async-await-generate/). It tracks things like

You wouldn't have an expression so much as a complex set of imperative code expressed using symbols which make it seem functional. It would DEFINITELY require a major compiler rewrite. It would also inject lots of complexity to your code and make it very difficult to debug.

bbarry commented 8 years ago

@bartdesmet has done a bunch of prototype work seemingly related to this issue: https://github.com/bartdesmet/ExpressionFutures/tree/master/CSharpExpressions

HaloFour commented 8 years ago

One thing that I've always wished that I could do with expression trees in C# was to be able to define a helper or extension method that could be invoked from within the expression but instead of a MethodCallExpression being emitted a helper method would be invoked that would expand into a portion of the expression tree.

:spaghetti:

[ExpressionExpansion(MethodName = nameof(IsBetweenExpression))]
public static bool IsBetween(this int value, int minimum, int maximum) {
    return (value >= minimum) && (value <= maximum);
}

public static Expression IsBetweenExpression(Expression value, Expression minimum, Expression maximum) {
    return Expression.AndAlso(
        Expression.GreaterThanOrEqual(value, minimum),
        Expression.LessThanOrEqual(value, maximum)
    );
}

alrz commented 8 years ago

@HaloFour With #8990 it will resolve into a pattern. So you don't need to worry about the MethodCallExpression itself.

HaloFour commented 8 years ago

@alrz

I don't see how that proposal is related. I wouldn't be interpreting these expression trees myself, I'd be passing them to some provider like Entity Framework. Unless you're suggesting that I wrap EF and translate the expression trees prior to it then interpreting them.

alrz commented 8 years ago

@HaloFour Ok I don't know what problem you're trying to solve.

HaloFour commented 8 years ago

@alrz I could have been more specific. The problem I'd like to solve is to be able to write a helper method or extension method that could be invoked within the body of a LINQ query that is used with a queryable provider such as Entity Framework. Today any such method calls are interpreted simply as MethodCallExpression which Entity Framework itself wouldn't understand (there are some hoops I can jump through with EF to make that possible, but that's not true of every queryable provider).
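Until something like that exists in the language, one way to approximate it is to rewrite the tree before handing it to the provider. The sketch below uses an `ExpressionVisitor` to expand calls to a helper method; `RangeExtensions`, `IsBetweenExpander` and `ExpansionDemo` are hypothetical names, and the rewrite has to be applied explicitly by the caller.

```csharp
using System;
using System.Linq.Expressions;

public static class RangeExtensions
{
    public static bool IsBetween(this int value, int minimum, int maximum)
        => value >= minimum && value <= maximum;
}

// Replaces calls to IsBetween with the equivalent comparison nodes.
public sealed class IsBetweenExpander : ExpressionVisitor
{
    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        if (node.Method.DeclaringType == typeof(RangeExtensions) &&
            node.Method.Name == nameof(RangeExtensions.IsBetween))
        {
            // Extension methods are static, so the receiver is Arguments[0].
            Expression value = Visit(node.Arguments[0]);
            Expression min = Visit(node.Arguments[1]);
            Expression max = Visit(node.Arguments[2]);
            // Note: the value expression appears twice in the expanded tree.
            return Expression.AndAlso(
                Expression.GreaterThanOrEqual(value, min),
                Expression.LessThanOrEqual(value, max));
        }
        return base.VisitMethodCall(node);
    }
}

public static class ExpansionDemo
{
    public static Expression<Func<int, bool>> Expand(Expression<Func<int, bool>> source)
        => (Expression<Func<int, bool>>)new IsBetweenExpander().Visit(source);

    public static void Main()
    {
        Expression<Func<int, bool>> query = x => x.IsBetween(1, 10);
        Expression<Func<int, bool>> expanded = Expand(query);
        Console.WriteLine(expanded.Body.NodeType); // AndAlso
        Console.WriteLine(expanded.Compile()(5));  // True
    }
}
```

After expansion the tree contains only comparison nodes, which a queryable provider is much more likely to understand than a call to an unknown method.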

alrz commented 8 years ago

I think this is somehow related to F#'s ReflectedDefinitionAttribute so that expression tree of the function body will be available at runtime, then your example would be

[ReflectedDefinitionAttribute]
public static bool IsBetween(this int value, int minimum, int maximum) {
    return (value >= minimum) && (value <= maximum);
}

or something like that. Because the compiler can already generate the expression tree for the IsBetween method body, there is no reason for you to write it by hand.

HaloFour commented 8 years ago

@alrz That would be pretty awesome.

svick commented 8 years ago

@HaloFour

> there are some hoops I can jump through with EF to make that possible, but that's not true of every queryable provider

There is a workaround that works for any provider: LINQKit.

bartdesmet commented 8 years ago

@Alphish, @kbirger - The link shared by @bbarry contains a fork of the Roslyn compiler and a new runtime library Microsoft.CSharp.Expressions with expression tree support for all the nodes referred to in this post (up to and including statement nodes), including async lambdas and await expressions. The expression tree capturing logic runs prior to the lowering stage for async methods (which generates the state machine) and hence captures the user intent at a high level, e.g.:

Expression<Func<Task<int>, Task<int>>> f = async t => 2 * await t;

will turn into:

var t = Expression.Parameter(typeof(Task<int>), "t");
var e = CSharpExpression.AsyncLambda<Func<Task<int>, Task<int>>>(Expression.Multiply(Expression.Constant(2), CSharpExpression.Await(t)), t);

When reducing the async lambda expression node, it performs a similar rewrite to the one carried out by the C# compiler but now at runtime. Effectively, it reduces the async lambda and await expression nodes into a state machine implementation, dealing with complexities such as re-entering try blocks, pending exceptions and branches, support for catch and finally, stack spilling, ref locals, etc. This supports Compile at runtime.

If one wants to analyze such a C#-specific expression, one would have to use the CSharpExpressionVisitor which has Visit methods for these nodes, without causing early reduction to more primitive expression nodes found in System.Linq.Expressions.

While this work is in a pretty good shape, there's no concrete plan (yet) to integrate this into mainstream C#. Much will depend on the community amplifying the ask for this functionality in order to rank it in the priority list. One significant work item will be to specify the behavior of expression tree conversion for lambdas more rigorously than was done before in the 3.0 days (which in turn led to tricky backwards compatibility work in the Roslyn code base as @VSadov can confirm).

Personally, I'm willing to help champion this feature given the desire within my team here in Bing to have this functionality. We appreciate your input, any feedback, suggestions for prioritization of expression tree features, etc.

kbirger commented 8 years ago

Very cool. Have any performance tests been run on this?

gulbanana commented 8 years ago

+1, I hit a BC37240 today..

stevozilik commented 7 years ago

+1 for the null-conditional operators such as a?.b and a?[b]. It would help with mocking/testing EF queries.

jnm2 commented 7 years ago

Wow, two years. Is there any part of this proposal that is up for grabs in case someone comes by who has time?

praeclarum commented 6 years ago

+1 for null propagation operators. I maintain a SQL ORM and people love to use this operator.

Personally, I would love statements to be supported.

Serguzest commented 6 years ago

The null propagation operator in expressions is needed even more with EF Core, since it keeps falling back to client-side evaluation and ends up throwing a null reference exception.

buzzytom commented 6 years ago

Statement body expression trees or a C# meta expression tree language would be the dream (even if it was completely separate from the current Expression).

You could then do some very cool things like transpiling C# into GPU shader code (at runtime?)! Yes, you could do this now, but you would have to use the Expression API to build the trees manually and it would be very clunky.

If this was implemented (with the current Expression type), it would break a lot of existing LINQ libraries.

olmobrutall commented 6 years ago

@buzzytom

> If this was implemented (with the current Expression type), it would break a lot of existing LINQ libraries.

I don't think it would be so bad to add new expression tree nodes even if old LINQ providers won't be able to translate them. Many (all?) of the LINQ providers are not complete anyway. For example, you cannot translate every C# expression to SQL.

Also, some expressions, like ?., could be reducible.
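To make "reducible" concrete: the expression tree API lets a custom node declare `CanReduce` and lower itself to simpler nodes via `Reduce()`, so consumers that do not know the node can still process the lowered form. The sketch below is one possible shape for such a node, not a proposed design; `NullConditionalMemberExpression` and `Item` are hypothetical, and it assumes the member type is a reference type.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical node representing receiver?.Member for reference-typed members.
public sealed class NullConditionalMemberExpression : Expression
{
    public NullConditionalMemberExpression(Expression receiver, string memberName)
    {
        Receiver = receiver;
        Member = Expression.PropertyOrField(receiver, memberName);
    }

    public Expression Receiver { get; }
    public MemberExpression Member { get; }

    public override ExpressionType NodeType => ExpressionType.Extension;
    public override Type Type => Member.Type;
    public override bool CanReduce => true;

    // Lowers to: receiver == null ? null : receiver.Member
    // The receiver is evaluated twice, which is fine for a parameter.
    public override Expression Reduce() =>
        Expression.Condition(
            Expression.Equal(Receiver, Expression.Constant(null, Receiver.Type)),
            Expression.Constant(null, Type),
            Member);
}

// Hypothetical type used only for the demo.
public class Item
{
    public string Label { get; set; }
}

public static class ReducibleDemo
{
    public static Expression<Func<Item, string>> BuildLabelAccessor()
    {
        ParameterExpression item = Expression.Parameter(typeof(Item), "item");
        return Expression.Lambda<Func<Item, string>>(
            new NullConditionalMemberExpression(item, nameof(Item.Label)), item);
    }

    public static void Main()
    {
        Func<Item, string> getLabel = BuildLabelAccessor().Compile();
        Console.WriteLine(getLabel(new Item { Label = "a" }));
        Console.WriteLine(getLabel(null) == null);
    }
}
```

A provider that recognizes the node could translate it directly, while one that does not would see the conditional produced by `Reduce()`.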

buzzytom commented 6 years ago

With how Expressions currently exist, most (if not all) expressions can be translated to SQL. Given statement bodies are introduced, suddenly the consumers of these frameworks will be able to use a myriad of expression structures that can't be translated into SQL. I just feel this should be a consideration when designing a solution to this.

I agree with the null propagation being reducible.

Maybe a separate StatementBodyExpression class that accepts an extended C# meta language and also accepts the existing expression objects:

StatementBodyExpression<Action<int>> expression = x =>
{
    CalculateThing(x);
    DoOtherThing();
};

svick commented 6 years ago

@buzzytom

> With how Expressions currently exist, most (if not all) expressions can be translated to SQL.

Blocks, assignments or loops, which can't be easily translated to SQL, already exist in the Expression tree API. So consumers can already use them. The fact that they can't be created from C# lambdas doesn't matter much.
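To illustrate the point, here is a small sketch that builds a statement-style body (a local variable, a loop and assignments) purely through the existing factory methods. The `StatementTrees` class and `BuildFactorial` method are hypothetical names for this example.

```csharp
using System;
using System.Linq.Expressions;

public static class StatementTrees
{
    // Builds the equivalent of:
    //   n => { int result = 1; while (n > 1) { result *= n; n--; } return result; }
    public static Expression<Func<int, int>> BuildFactorial()
    {
        ParameterExpression n = Expression.Parameter(typeof(int), "n");
        ParameterExpression result = Expression.Variable(typeof(int), "result");
        LabelTarget done = Expression.Label(typeof(int), "done");

        BlockExpression body = Expression.Block(
            new[] { result },
            Expression.Assign(result, Expression.Constant(1)),
            Expression.Loop(
                Expression.IfThenElse(
                    Expression.GreaterThan(n, Expression.Constant(1)),
                    Expression.Block(
                        Expression.MultiplyAssign(result, n),
                        Expression.PostDecrementAssign(n)),
                    // Leaving the loop with a value makes the loop (and block) yield it.
                    Expression.Break(done, result)),
                done));

        return Expression.Lambda<Func<int, int>>(body, n);
    }

    public static void Main()
    {
        Func<int, int> factorial = BuildFactorial().Compile();
        Console.WriteLine(factorial(5)); // 120
    }
}
```

The tree compiles and runs, but a SQL-backed provider handed this lambda would have no obvious translation for the loop, which is the limitation being discussed.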

olmobrutall commented 6 years ago

@buzzytom

The expression ctx.Invoices.Where(inv => IsPrime(inv.Id)) where IsPrime is a function written in C# can also not be translated to SQL. So the limitation is already there, just getting bigger.
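A quick way to see what a provider receives in this case is to walk the tree and list the method calls it contains. The sketch below does that with an `ExpressionVisitor`; `Invoice`, `Numbers.IsPrime` and `MethodCallCollector` are hypothetical stand-ins for the types in the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public class Invoice
{
    public int Id { get; set; }
}

public static class Numbers
{
    // Ordinary C# code: the expression tree only records a call to it, not its body.
    public static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int i = 2; (long)i * i <= n; i++)
        {
            if (n % i == 0) return false;
        }
        return true;
    }
}

// Collects every method call found in an expression tree.
public sealed class MethodCallCollector : ExpressionVisitor
{
    public List<string> Calls { get; } = new List<string>();

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        Calls.Add(node.Method.DeclaringType?.Name + "." + node.Method.Name);
        return base.VisitMethodCall(node);
    }
}

public static class CollectorDemo
{
    public static List<string> FindCalls(Expression expression)
    {
        var collector = new MethodCallCollector();
        collector.Visit(expression);
        return collector.Calls;
    }

    public static void Main()
    {
        Expression<Func<Invoice, bool>> filter = inv => Numbers.IsPrime(inv.Id);
        Console.WriteLine(string.Join(", ", FindCalls(filter))); // Numbers.IsPrime
    }
}
```

The provider sees only the opaque call node, so it must either know the method by name or give up on translation.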

buzzytom commented 6 years ago

@svick As you've pointed out, with the Expression tree api. It is an extra guard against incompatible expressions. It does not capture method calls as @olmobrutall pointed out.

@olmobrutall To be honest that's a fair point. There are subtle reasons for why methods are allowed e.g: Contains, sub-queries and all those other anomalies.

I shall point out, I'm not arguing for any particular solution, I'm just pointing out some of the reasons they may not currently be allowed. Maybe the Roslyn team may be able to clarify some points?

mcintyre321 commented 6 years ago

You can map custom methods to database UDFs in Entity Framework.

If a method isn't supported, it's a limitation of the LINQ provider or its configuration/usage.

olmobrutall commented 6 years ago

You cannot map every method in .NET to a UDF.

That's the point. LINQ providers have limitations. Using C# instead of SQL is better because of static typing and consistency in trivialities like Substring using 0-based indexes, but you need to know that you have SQL under the covers.

buzzytom commented 6 years ago

To be honest, at this point, I'm wondering how this can be stagnant for this long. Is it simply that it's just an edge case that few would actually use?