Proposal: IL optimization step

benaadams commented 7 years ago

The jit can only apply so many optimizations as it is time constrained at runtime.

AOT/NGen, which has more time, ends up with asm but loses some optimizations the JIT can do at runtime, as it needs to be conservative about CPU architecture, can't turn static readonly fields into consts, etc.

The compile-to-IL compilers (C#/VB.NET/etc.) aren't as time constrained, but they consider a lot of optimizations out of scope.

Do we need a 3rd compiler between roslyn and jit that optimizes the il as part of the regular compile or a "publish" compile?

This could be a collaboration between the jit and roslyn teams?

I'm sure there is lots of low-hanging fruit between the two: optimizations that the JIT would like to do but that are too expensive at runtime.

There is also the whole program optimization or linker + tree shaking which is also a component in this picture. (e.g. Mono/Xamarin linker). Likely also partial linking (e.g. nugets/non-platform runtime libs)

From https://github.com/dotnet/roslyn/issues/15644#issuecomment-267370590

/cc @migueldeicaza @gafter @jkotas @AndyAyersMS

JosephTremoulet commented 7 years ago

Yes, I also realize that an IL rewrite step needn't be limited to things that are in scope for roslyn-linq-rewrite. This thread has many specific suggestions of those. I am trying to ask a question specifically about the linq-related suggestions that have been made on this thread, please don't interpret it as a statement on anything beyond that. Since my question keeps getting buried by answers to a more general one that wasn't what I was trying to ask, I'll repeat it: for the sort of transforms that roslyn-linq-rewrite performs, is there a benefit/desire to have an IL rewrite step perform those same ones? If so, why, and if not, what similar transforms have people had in mind when pointing to it as an example?

tannergooding commented 7 years ago

Sorry. I had misunderstood your question originally, my bad 😄 (if I still managed to derail your question below, just ping me and I'll remove).

for the sort of transforms that roslyn-linq-rewrite performs, is there a benefit/desire to have an IL rewrite step perform those same ones?

I don't (personally) think there is any major benefit to having the transforms roslyn-linq-rewrite performs also duplicated in an IL rewrite step.

However, I do think there are some minor benefits:

You only have to run one tool, instead of 2 or 3
Hypothetically, there could be some roslyn-linq-rewrite transformations that are only possible after some other optimization or analysis has been done first. Having all the logic in a single tool would make that easier

what similar transforms have people had in mind when pointing to it as an example?

I think LINQ is a big one just because it makes writing your code so much easier, but it can also slow your code down if not done carefully.

I think auto-vectorization and auto-parallelization would be other similar transformations (just thinking of the more complex optimizations a native compiler might do). I think both of these are generally considered machine-independent (but of course, there are exceptions).

Pzixel commented 7 years ago

@tannergooding

I don't (personally) think there is any major benefit to having the transforms roslyn-linq-rewrite

You're so wrong here :) It has huge benefit.

I think LINQ is a big one just because it makes writing your code so much easier, but it can also slow your code down if not done carefully.

It slows down your code whether you care about it or not, because it involves tons of delegate callbacks instead of pure imperative code. If you are working with a DB then you don't care, because you are already on the slow path, but in-memory transformations via LINQ are so sweet and they are slow too, up to 100 times IIRC. For example, in my current project I use LINQ 1733 times in 8051 files. That seems to be a lot.
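
To make the delegate-callback point concrete, here is a small sketch. It is written in Python as a stand-in for C#, and the function names are made up for illustration. The first version chains lazy stages that each invoke a lambda per element, which is roughly the shape of `Where(...).Select(...).Sum()`; the second is the single fused loop a rewriter would aim to produce. It only shows the structural difference and makes no claim about actual speedups.

```python
def sum_of_even_squares_pipeline(values):
    # One callback invocation per element per stage,
    # similar in shape to Where(...).Select(...).Sum().
    evens = filter(lambda v: v % 2 == 0, values)
    squares = map(lambda v: v * v, evens)
    return sum(squares)


def sum_of_even_squares_fused(values):
    # The same computation as one imperative loop with no callbacks.
    total = 0
    for v in values:
        if v % 2 == 0:
            total += v * v
    return total


data = list(range(10))
# Both forms must agree for the rewrite to be valid.
assert sum_of_even_squares_pipeline(data) == sum_of_even_squares_fused(data)
```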

mikedn commented 7 years ago

It has huge benefit

Well, what is that benefit? That was the question.

Pzixel commented 7 years ago

@mikedn see Steno project, I don't know. MS research:

mikedn commented 7 years ago

Steno project, I don't know. MS research, if you know:

The question was not about the effect of the optimization. I don't think anybody questions the fact that LINQ is not exactly efficient and that replacing the zillions of calls and allocations it generates would speed things up.

The question is about the various optimization approaches. Roslyn rewriter. IL rewriter. And, why not, even "JIT rewriter".

Pzixel commented 7 years ago

@mikedn

IL rewriter - very complicated, and provides almost the same possibilities as Roslyn. See the Code Contracts library. Why restore code trees manually if we already have Roslyn? And if we do, why not perform it at compile time instead of transforming stuff back and forth?
Roslyn - is guaranteed to generate valid IL, is much more user-friendly, already has an Analyzer API, and is going to provide a replace/original API which is ideal for this purpose.
JIT - doesn't perform even much simpler optimizations, so I don't expect its developers to bother implementing something similar. Another problem is that in this case the compiler would have to know about System.Linq.Enumerable and all this stuff. That's OK for a custom analyzer, but not for a general-purpose compiler.

mikedn commented 7 years ago

So if I understand correctly, you prefer the Roslyn approach. Not because you actually need such optimizations to be implemented this way but because the other approaches seem more complicated.

Makes sense but at the same time it means that this implementation is tied to Roslyn and thus available to C# and VB only. Other languages will have to do their own thing.

benaadams commented 7 years ago

The way I see it is

JIT - Can optimize beyond what is expressible in IL; however is more focused on per function optimizations (always applied)
Roslyn - converts C# and VB to verifiable IL; some optimization in release (always applied)
AoT IL rewriter - opt-in configuration; whole program optimizations; linking/tree-shaking; transforms that may produce unverifiable IL; function splitting (inlinable fast-path; non-inlining code path) etc

.NET IL Linker is an example of linking/tree-shaking with whole program analysis.
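
As a rough illustration of what linking/tree-shaking with whole program analysis means, here is a minimal sketch in Python. The call graph, method names and `reachable_methods` helper are all hypothetical, and this is not how .NET IL Linker is implemented; a real linker also has to account for things like reflection that a static call graph does not show.

```python
def reachable_methods(call_graph, roots):
    """Return every method reachable from the roots by following calls."""
    kept = set()
    worklist = list(roots)
    while worklist:
        method = worklist.pop()
        if method in kept:
            continue
        kept.add(method)
        # Everything this method calls must be kept as well.
        worklist.extend(call_graph.get(method, ()))
    return kept


# Hypothetical whole-program call graph: caller -> callees.
call_graph = {
    "Program.Main": ["Parser.Parse", "Logger.Info"],
    "Parser.Parse": ["Parser.ReadToken"],
    "Parser.ReadToken": [],
    "Logger.Info": [],
    "Logger.Debug": ["Logger.Format"],
    "Logger.Format": [],
}

kept = reachable_methods(call_graph, ["Program.Main"])
# Methods never reached from the entry point can be shaken out.
removed = sorted(set(call_graph) - kept)
```

Here `removed` ends up holding `Logger.Debug` and `Logger.Format`, since nothing reachable from `Program.Main` calls them.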

JosephTremoulet commented 7 years ago

.NET IL Linker is an example of linking/tree-shaking with whole program analysis.

Yes, exactly. My team is looking at expanding the rewrites available in .NET IL Linker. I'm currently trying to assess what rewrites people are interested in having made available, for planning/prioritization purposes, which naturally brought me to this issue where there's been much discussion of that. LINQ rewriting seems to generate a lot of interest, and for the various reasons already mentioned above doesn't seem like it will have a home in the JIT or in Roslyn proper (where by "proper" I mean excluding opt-in extensions). So that would make it a good candidate, except that AFAIK anybody who would benefit from having it available to opt into in .NET IL Linker could just as well opt into it by adding roslyn-linq-rewrite to their build process. But I've been wondering if I'm somehow glossing over something with that line of reasoning, hence my questions about it. The takeaway I'm getting from the responses is that no, there isn't any benefit to LINQ rewriting in .NET IL Linker over what's already available via roslyn-linq-rewrite, and that LINQ rewriting has come up on this thread simply as an example of something useful that has been done that doesn't fit in the JIT or Roslyn proper.

jkotas commented 7 years ago

anybody who would benefit from having it available to opt into in .NET IL Linker could just as well opt into it by adding roslyn-linq-rewrite to their build process.

The difference is credibility. roslyn-linq-rewrite is a one-man project, last updated one year ago. It is a custom build of the Roslyn compiler. It is hard to use for any serious project in its current form.

mikedn commented 7 years ago

and for the various reasons already mentioned above doesn't seem like it will have a home in the JIT

Come to think of it, the real reason such optimizations may not belong in the JIT hasn't been mentioned. These LINQ "optimizations" aren't optimizations in the true sense. Those rewriters don't understand the System.Linq code and optimize it; they assume that LINQ's methods do certain things and generate code that supposedly behaves identically. That is, they treat those methods as intrinsics.

That's something that the JIT could probably do too, except it requires generating significant amounts of IR and that may be cumbersome.
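
A toy sketch of the "treat the methods as intrinsics" idea, in Python rather than C# and with invented stage names (`where`, `select`) and an invented `rewrite_to_loop` helper. The rewriter never inspects how a stage is implemented; it only recognizes the name and emits the loop code it assumes is equivalent, and it gives up on anything it does not recognize.

```python
def rewrite_to_loop(stages):
    """Emit loop source for a chain of known stages, trusting their assumed meaning."""
    indent = "        "
    body = []
    for name, expr in stages:
        if name == "where":
            # Assumed meaning: skip elements that fail the predicate.
            body.append(f"{indent}if not ({expr}):\n{indent}    continue")
        elif name == "select":
            # Assumed meaning: replace the element with the projection.
            body.append(f"{indent}x = {expr}")
        else:
            # No built-in knowledge of this stage, so no rewrite is possible.
            raise ValueError(f"unknown stage {name!r}")
    lines = ["def query(source):", "    total = 0", "    for x in source:"]
    lines += body
    lines += ["        total += x", "    return total"]
    return "\n".join(lines)


# Rewrite the equivalent of Where(x % 2 == 0).Select(x * x).Sum().
generated = rewrite_to_loop([("where", "x % 2 == 0"), ("select", "x * x")])
namespace = {}
exec(generated, namespace)  # compile the emitted loop
query = namespace["query"]
```

If a stage's real implementation ever behaved differently from what the rewriter assumes, the emitted loop would silently diverge, which is the risk of this approach.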

But probably the main problem with doing this in the JIT is that, for better or worse, there's not a single JIT. There's RyuJIT, there's .NETNative, there's Mono... Sheesh, one way or another some duplicate work will happen.

I'm currently trying to assess what rewrites people are interested in having made available, for planning/prioritization purposes, which naturally brought me to this issue where there's been much discussion of that

So where's IL linker's repository? Let people create issues, discuss, vote on them etc. That's better than "hijacking" an existing thread like this. Granted, you may end up with a bunch of noise but that's life.

Here's a fancy idea. What if the IL linker would CSE and hoist everything it can and then the JIT would do some kind of rematerialization to account for target architecture realities? Perhaps it would be cheaper for the JIT to do that instead of CSE. Granted... that's a bit beyond the idea of "linker" :)
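
To spell out what CSE and hoisting would do, here is a before/after sketch in Python with made-up function names. The "after" form is what an ahead-of-time rewriter might emit; whether the JIT later keeps `area` in a register or recomputes it (rematerialization) would depend on the target architecture.

```python
def scale_naive(values, width, height):
    out = []
    for v in values:
        # width * height is a common subexpression, computed twice per
        # iteration even though it never changes inside the loop.
        out.append(v * (width * height) + (width * height))
    return out


def scale_hoisted(values, width, height):
    # CSE: compute the shared expression once.
    # Hoisting: move it out of the loop because it is loop-invariant.
    area = width * height
    out = []
    for v in values:
        out.append(v * area + area)
    return out


assert scale_naive([1, 2, 3], 2, 3) == scale_hoisted([1, 2, 3], 2, 3)
```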

jkotas commented 7 years ago

That's something that the JIT could probably do too, except it requires generating significant amounts of IR and that may be cumbersome.

I believe that it is hard for the more interesting Linq optimizations to preserve all side-effects. It should not matter for well-written Linq queries, but it is a problem for poorly written Linq queries.

It is ok for an opt-in build-time tool to change the behavior of poorly written Linq queries. It is not ok for the JIT to do it at runtime.
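
One hypothetical example of the side-effect problem, sketched in Python instead of C# with invented names: counting the results of a projection. An optimized rewrite can skip the projection entirely because the count does not depend on it, but if the projection has a side effect (the "poorly written" case), the observable behavior changes. This is not claimed to be a transform that any existing tool performs.

```python
log = []


def noisy_projection(v):
    log.append(v)  # side effect hidden inside the query
    return v * 2


def count_faithful(values):
    # Runs the projection for every element, like Select(f).Count()
    # evaluated naively.
    return sum(1 for _ in map(noisy_projection, values))


def count_optimized(values):
    # Assumes the projection is pure, so it is never called.
    return len(values)


items = [1, 2, 3]
assert count_faithful(items) == count_optimized(items)
```

The two functions return the same count, but only `count_faithful` appends to `log`.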

JosephTremoulet commented 7 years ago

So where's IL linker's repository?

https://github.com/mono/linker

Let people create issues, discuss, vote on them etc. That's better than "hijacking" an existing thread like this

Yes, I completely agree it will be more productive to discuss potential new rewrites over there, with separate issues for each. I wasn't trying to extend general discussion on this thread, just ask a clarifying question about a few specific comments on it (to which I now have the answer, thanks @jkotas).

gafter commented 6 years ago

I think the conclusion here is that such work would not likely be part of Roslyn, but more likely in https://github.com/mono/linker . However, we'll leave this open so people can find this.

Pzixel commented 6 years ago

But in this case it's tied to mono, isn't it? What about Core, the full framework, etc.?

jkotas commented 6 years ago

It is not tied to mono. https://github.com/dotnet/announcements/issues/30

dotnet / roslyn

Proposal: IL optimization step #15929