Open benaadams opened 7 years ago
Yes, I also realize that an IL rewrite step needn't be limited to things that are in scope for roslyn-linq-rewrite. This thread has many specific suggestions of those. I am trying to ask a question specifically about the LINQ-related suggestions that have been made on this thread; please don't interpret it as a statement on anything beyond that. Since my question keeps getting buried by answers to a more general one that wasn't what I was trying to ask, I'll repeat it: for the sort of transforms that roslyn-linq-rewrite performs, is there a benefit/desire to have an IL rewrite step perform those same ones? If so, why, and if not, what similar transforms have people had in mind when pointing to it as an example?
Sorry. I had misunderstood your question originally, my bad 😄 (if I still managed to derail your question below, just ping me and I'll remove).
for the sort of transforms that roslyn-linq-rewrite performs, is there a benefit/desire to have an IL rewrite step perform those same ones?
I don't (personally) think there is any major benefit to having the transforms roslyn-linq-rewrite performs also duplicated in an IL rewrite step.
However, I do think there are some minor benefits:
what similar transforms have people had in mind when pointing to it as an example?
I think LINQ is a big one just because it makes writing your code so much easier, but it can also slow your code down if not done carefully.
I think auto-vectorization and auto-parallelization would be other similar transformations (just thinking of the more complex optimizations a native compiler might do). I think both of these are generally considered machine-independent (but of course, there are exceptions).
@tannergooding
I don't (personally) think there is any major benefit to having the transforms roslyn-linq-rewrite
You're so wrong here :) It has huge benefit.
I think LINQ is a big one just because it makes writing your code so much easier, but it can also slow your code down if not done carefully.
It slows down your code whether you care about it or not, because it involves tons of delegate callbacks instead of pure imperative code. If you are working with a DB then you don't care because you are already on the slow path, but in-memory transformations via LINQ are so sweet and yet so slow too, up to 100 times IIRC. For example, in my current project I use LINQ 1733 times in 8051 files. That seems to be a lot.
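To make the "delegate callbacks instead of pure imperative code" point concrete, here is a minimal sketch in Python rather than C#, so treat it as an analogy only. The helpers `where` and `select` are hypothetical stand-ins for the LINQ operators, not the real `System.Linq` implementation.

```python
from typing import Callable, Iterable, Iterator


def where(source: Iterable[int], predicate: Callable[[int], bool]) -> Iterator[int]:
    # Hypothetical stand-in for a LINQ-style filter:
    # one callback invocation per element, plus one generator object per stage.
    for item in source:
        if predicate(item):
            yield item


def select(source: Iterable[int], projector: Callable[[int], int]) -> Iterator[int]:
    # Hypothetical stand-in for a LINQ-style projection.
    for item in source:
        yield projector(item)


def sum_of_even_squares_pipeline(values: Iterable[int]) -> int:
    # Query style: two chained stages, each calling back into a lambda.
    return sum(select(where(values, lambda x: x % 2 == 0), lambda x: x * x))


def sum_of_even_squares_loop(values: Iterable[int]) -> int:
    # What a rewriter would aim to produce: one loop, no callbacks, no iterators.
    total = 0
    for x in values:
        if x % 2 == 0:
            total += x * x
    return total
```

Both functions compute the same value; the second just avoids the per-element indirection. How large the gap is in practice depends on the runtime and the query, so measure before relying on any particular figure.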
It has huge benefit
Well, what is that benefit? That was the question.
@mikedn see Steno project, I don't know. MS research:
Steno project, I don't know. MS research, if you know:
The question was not about the effect of the optimization. I don't think anybody questions the fact that LINQ is not exactly efficient and that replacing the zillions of calls and allocations it generates would speed things up.
The question is about the various optimization approaches. Roslyn rewriter. IL rewriter. And, why not, even "JIT rewriter".
@mikedn
System.Enumerable
and all this stuff. It's ok for a custom analyzer, but it's not for a general-purpose compiler.
So, if I understand correctly, you prefer the Roslyn approach. Not because you actually need such optimizations to be implemented this way but because the other approaches seem more complicated.
Makes sense but at the same time it means that this implementation is tied to Roslyn and thus available to C# and VB only. Other languages will have to do their own thing.
The way I see it is
.NET IL Linker is an example of linking/tree-shaking with whole program analysis.
.NET IL Linker is an example of linking/tree-shaking with whole program analysis.
Yes, exactly. My team is looking at expanding the rewrites available in .NET IL Linker. I'm currently trying to assess what rewrites people are interested in having made available, for planning/prioritization purposes, which naturally brought me to this issue where there's been much discussion of that. LINQ rewriting seems to generate a lot of interest, and for the various reasons already mentioned above doesn't seem like it will have a home in the JIT or in Roslyn proper (where by "proper" I mean excluding opt-in extensions). So that would make it a good candidate, except that AFAIK anybody who would benefit from having it available to opt into in .NET IL Linker could just as well opt into it by adding roslyn-linq-rewrite to their build process. But I've been wondering if I'm somehow glossing over something with that line of reasoning, hence my questions about it. The takeaway I'm getting from the responses is that no, there isn't any benefit to LINQ rewriting in .NET IL Linker over what's already available via roslyn-linq-rewrite, and that LINQ rewriting has come up on this thread simply as an example of something useful that has been done that doesn't fit in the JIT or Roslyn proper.
anybody who would benefit from having it available to opt into in .NET IL Linker could just as well opt into it by adding roslyn-linq-rewrite to their build process.
The difference is credibility. roslyn-linq-rewrite is a one-man project, last updated one year ago. It is a custom build of the Roslyn compiler. It is hard to use for any serious project in its current form.
and for the various reasons already mentioned above doesn't seem like it will have a home in the JIT
Come to think of it, the real reason such optimizations may not belong in the JIT hasn't been mentioned. These LINQ "optimizations" aren't optimizations in the true sense. Those rewriters don't understand the System.Linq code and optimize it; they assume that LINQ's methods do certain things and generate code that supposedly behaves identically. That is, they treat those methods as intrinsics.
That's something that the JIT could probably do too, except it requires generating significant amounts of IR and that may be cumbersome.
But probably the main problem with doing this in the JIT is that, for better or worse, there's not a single JIT. There's RyuJIT, there's .NET Native, there's Mono... Sheesh, one way or another some duplicate work will happen.
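The "treat those methods as intrinsics" idea can be sketched with a toy rewriter. This is a Python analogy with hypothetical names (`fuse`, and the stage names `"where"` and `"select"`), not how roslyn-linq-rewrite or any JIT is actually implemented. The point is that the rewriter never looks at the operators' code; it dispatches on the operator name and emits a loop that is assumed to behave the same.

```python
def fuse(stages):
    """Build a single loop from (name, callback) stages.

    The operator implementations are never inspected: the semantics of
    "where" and "select" are simply assumed from their names.
    """
    for name, _ in stages:
        if name not in ("where", "select"):
            # Anything the rewriter has no built-in knowledge of is rejected.
            raise ValueError(f"unknown operator: {name}")

    def run(source):
        result = []
        for item in source:
            keep = True
            for name, callback in stages:
                if name == "where":
                    if not callback(item):
                        keep = False
                        break
                else:  # "select"
                    item = callback(item)
            if keep:
                result.append(item)
        return result

    return run
```

For example, `fuse([("where", lambda x: x > 1), ("select", lambda x: x * 10)])` returns a function that filters and projects in one pass. If someone shipped a `where` with different semantics, this rewriter would silently produce different results, which is the risk being described above.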
I'm currently trying to assess what rewrites people are interested in having made available, for planning/prioritization purposes, which naturally brought me to this issue where there's been much discussion of that
So where's IL linker's repository? Let people create issues, discuss, vote on them etc. That's better than "hijacking" an existing thread like this. Granted, you may end up with a bunch of noise but that's life.
Here's a fancy idea. What if the IL linker would CSE and hoist everything it can and then the JIT would do some kind of rematerialization to account for target architecture realities? Perhaps it would be cheaper for the JIT to do that instead of CSE. Granted... that's a bit beyond the idea of "linker" :)
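For readers unfamiliar with the terms, this is roughly what CSE plus hoisting does, shown as a hand-written before/after in Python. It is only an illustration of the transformation; the function names are made up, and nothing here reflects how an IL linker or the JIT would implement it.

```python
import math


def scale_naive(values, angle):
    # math.cos(angle) is evaluated twice per element although it never changes.
    return [v * math.cos(angle) + v * math.cos(angle) ** 2 for v in values]


def scale_hoisted(values, angle):
    # The common subexpression is computed once, outside the loop.
    c = math.cos(angle)
    return [v * c + v * c ** 2 for v in values]
```

Rematerialization would be the reverse step: if keeping `c` live costs more than recomputing it on a given target (for example, under register pressure), the later stage recomputes it instead.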
That's something that the JIT could probably do too, except it requires generating significant amounts of IR and that may be cumbersome.
I believe that it is hard for the more interesting Linq optimizations to preserve all side-effects. It should not matter for well-written Linq queries, but it is a problem for poorly written Linq queries.
It is ok for an opt-in build-time tool to change the behavior of poorly written Linq queries. It is not ok for the JIT to do it at runtime.
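A small sketch of the side-effect problem, again as a Python analogy with hypothetical helper names. It does not reproduce any specific rewrite that roslyn-linq-rewrite performs; it only shows how a transformation can keep the result identical while changing the order in which side effects are observed.

```python
def make_callbacks(log):
    # "Poorly written" query callbacks: both have a visible side effect.
    def is_even(x):
        log.append(f"test {x}")
        return x % 2 == 0

    def square(x):
        log.append(f"square {x}")
        return x * x

    return is_even, square


def lazy_pipeline(values):
    # Lazy, element-at-a-time evaluation: filter and projection interleave.
    log = []
    is_even, square = make_callbacks(log)
    result = list(map(square, filter(is_even, values)))
    return result, log


def stage_at_a_time(values):
    # A hypothetical "optimized" form that finishes filtering before projecting.
    log = []
    is_even, square = make_callbacks(log)
    kept = [x for x in values if is_even(x)]
    result = [square(x) for x in kept]
    return result, log
```

For the input `[2, 4]` both versions return `[4, 16]`, but the first logs `test 2, square 2, test 4, square 4` and the second logs `test 2, test 4, square 2, square 4`. A well-written query has side-effect-free callbacks and cannot tell the difference; a poorly written one can.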
So where's IL linker's repository?
https://github.com/mono/linker
Let people create issues, discuss, vote on them etc. That's better than "hijacking" an existing thread like this
Yes, I completely agree it will be more productive to discuss potential new rewrites over there, with separate issues for each. I wasn't trying to extend general discussion on this thread, just ask a clarifying question about a few specific comments on it (to which I now have the answer, thanks @jkotas).
I think the conclusion here is that such work would not likely be part of Roslyn, but more likely in https://github.com/mono/linker . However, we'll leave this open so people can find this.
But in this case it's tied to mono, isn't it? What about Core, the full framework, etc.?
It is not tied to mono. https://github.com/dotnet/announcements/issues/30
The jit can only apply so many optimizations as it is time constrained at runtime.
AOT/NGen, which has more time, ends up with asm but loses some optimizations the jit can do at runtime (static readonly to consts, etc.) as it needs to be conservative as to cpu architecture.
The compile-to-IL compilers (C#/VB.NET/etc.) aren't as time constrained, but have a lot of optimizations considered out of scope.
Do we need a 3rd compiler between roslyn and jit that optimizes the il as part of the regular compile or a "publish" compile?
This could be a collaboration between the jit and roslyn teams?
I'm sure there is lots of low-hanging fruit between the two: optimizations that the jit would like to do but that are too expensive at runtime.
There is also the whole program optimization or linker + tree shaking which is also a component in this picture. (e.g. Mono/Xamarin linker). Likely also partial linking (e.g. nugets/non-platform runtime libs)
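As a rough illustration of what tree shaking with whole program analysis means, here is a toy reachability pass over a call graph, in Python. The graph format and the `reachable` helper are assumptions made up for this sketch; a real linker also has to account for reflection, virtual dispatch, and similar features.

```python
def reachable(call_graph, roots):
    """Return every method reachable from the given entry points.

    call_graph maps a method name to the names of the methods it calls.
    Anything not in the returned set could, in principle, be removed.
    """
    seen = set()
    stack = list(roots)
    while stack:
        method = stack.pop()
        if method in seen:
            continue
        seen.add(method)
        stack.extend(call_graph.get(method, ()))
    return seen
```

With `{"Main": ["A"], "A": ["B"], "B": [], "Unused": ["B"]}` and `Main` as the only root, `Unused` is not reachable and would be dropped.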
From https://github.com/dotnet/roslyn/issues/15644#issuecomment-267370590
/cc @migueldeicaza @gafter @jkotas @AndyAyersMS