dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.47k stars 4.77k forks source link

JitInterface: distinguish among various classes of noinline #6249

Open AndyAyersMS opened 8 years ago

AndyAyersMS commented 8 years ago

CORINFO_FLG_DONT_INLINE covers a number of possibilities:

Looking forward, it would be good to distinguish among these since future code generators may have different inline capabilities and different assessments of profitablility.

category:cq theme:inlining skill-level:expert cost:medium

mikedn commented 8 years ago

Does this include CORINFO_FLG_BAD_INLINEE? A CORINFO_FLG_NO_RETURN could be handy :)

AndyAyersMS commented 8 years ago

I am skeptical that we can propagate the no return information back up into the runtime metadata like we do for no inline. The acid test here is that the jit can only incorporate knowledge gleaned from analyzing the callee if it can inline the callee. The can inline test is both (root) caller and callee sensitive. So it's not dependent only on the callee.

For no inline we can be a bit lax and accept some false positives, since at worst we'll miss out on an optimization opportunity. But for no return we have to be sound and if we claim a method is no return it needs to be no return.

mikedn commented 8 years ago

Sorry but I don't understand. If the callee is once determined to be "no return" then it's always "no return" not matter who the caller is. AFAIK at the time the "no return" information is determined no optimizations have taken place that could transform the callee in such a way that it is caller specific.

AndyAyersMS commented 8 years ago

One scenario where this comes up is servicing.

Suppose we have a method X in some assembly A and a caller AX in the same assembly. In some other assembly B we have another caller, BX.

When crossgenning A, the jit compiles AX. During this compilation the jit queries the runtime and does other checks and deduces that X can be inlined. It then goes to look at X's IL and deduces that it does not return, and it optimizes AX based on this deduction. No problem so far.

When crossgenning B, the jit compiles BX. Here it sees another call to X. If the jit relies only on the earlier no return deduction to optimize this call, then the crossgenned code for BX (and hence, B) is now dependent on the version of X in A that was present during crossgen.

Because of servicing, if A and B are in different "servicing bubbles," the version of A seen at runtime when BX runs may be different than the version of A seen when BX was crossgenned. For instance the runtime version of A may have an updated X that conditionally throws or never throws, etc.

The canInline check, among other things, determines if caller and callee are in the same serving bubble. So to be safe, when compiling BX, the jit must perform the can inline check for X, and if it fails, it cannot safely use the earlier no return deduction for X.

R2R (the on-by default "non-fragile" version of crossgen) is careful not to bake in these kinds of dependencies, precisely to enable servicing of this sort.

mikedn commented 8 years ago

I see, I didn't realize that this flag is persisted, I thought that it's only "in memory" and it would affect only AX.

Still, it seems that this could be used conservatively. It works for AX, it works for any method that calls X and is JITed at runtime, we just need to ignore the stored "no return" information when crossgening B.

That said, it maybe too much trouble to implement this. I already took a quick look at this and it seems that there's not much room left for additional flags to begin with.

AndyAyersMS commented 7 years ago

If by "additional flags" you mean the backing field for MethodDescClassification then yeah, all 16 bits appear to be in use...

mikedn commented 7 years ago

If by "additional flags" you mean the backing field for MethodDescClassification then yeah, all 16 bits appear to be in use...

Yep, MethodDescClassification as stored in MethodDesc::m_wFlags member. Though I haven't checked if there isn't any spare room in MethodDesc and NDirectMethodDesc that would allow more information to be stored without increasing the struct size...

AndyAyersMS commented 7 years ago

For CoreCLR, we can probably repurpose some of the bits in in either m_bflags2 or in m_wFlags that deal with CAS. If so, that gives us at least 4 bits to use, which are currently

noahfalk commented 7 years ago

To the best of my knowledge we aren't using the transparency bits either? One thing to be wary of, Profiler IL instrumentation and Edit and Continue both provide ways to alter the IL for a method after it was jitted and jit it again. We'll need to take care that cached computed properties of the IL aren't being inappropriately applied to new IL versions. ENC disables all the JIT optimizations and the JIT doesn't inline profiler modified IL from rejit, so both may wind up being moot in this case. I raise it only as a general consideration for this type of caching.

jkotas commented 7 years ago

If so, that gives us at least 4 bits to use, which are current

I have cleaned up a bunch of CAS related code. These bits are explicitly marked as unused now.

AndyAyersMS commented 4 years ago

It would still be useful to have this; with the advent of PGO we may now be using a different inline heuristic at runtime than we use when prejitting, and so might be willing to reconsider a "not profitable" decision made during prejitting.

Doesn't make the cut for 6.0 unless we start finding compelling examples.