Open sandreenko opened 5 years ago
I would like to continue working on this case. Could you give me some advice on how to move forward, @AndyAyersMS @CarolEidt @dotnet/jit-contrib?
> `Compiler::optCopyProp` looks like the best candidate to handle this extra move, but currently, it doesn't work because:
It does, though this might be a special case that I haven't noticed before - the source of the copy is a PHI.
> but it would be declined, because `000361` is long and `000081` is int, so `copyProp` ends on:
Ah, that's weird. I suppose the variable type is actually long and the JIT retypes the LCL_VAR node to avoid inserting a cast. A bad idea IMO. Granted, adding back the cast won't actually fix the problem since this is no longer a copy. Though then the cast could perhaps be made contained before codegen (since such a narrowing cast is basically a no-op).
Though I suppose you may as well modify `if (op->TypeGet() != tree->TypeGet())` to recognize this particular pattern of long->int reinterpretation. When in Rome...
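
To illustrate why the long->int reinterpretation is harmless for this multiply pattern, here is a minimal C++ sketch. The helper names `narrow`, `mulNarrowFirst`, and `mulNarrowLast` are made up for illustration and are not JIT code. It assumes unsigned wraparound arithmetic and shows that the low 32 bits of a product depend only on the low 32 bits of the operands, so narrowing before or after the multiply gives the same int result.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: reinterpret a 64-bit value as 32-bit by keeping
// only the low half (the effect of a long->int narrowing).
inline std::uint32_t narrow(std::uint64_t wide) {
    return static_cast<std::uint32_t>(wide);
}

// Narrow first, then multiply in 32 bits. This models an int-typed use
// of a variable whose storage is actually 64 bits wide.
inline std::uint32_t mulNarrowFirst(std::uint64_t wide, std::uint32_t k) {
    return narrow(wide) * k;
}

// Multiply in 64 bits, then narrow the product.
inline std::uint32_t mulNarrowLast(std::uint64_t wide, std::uint32_t k) {
    return narrow(wide * k);
}
```

Both functions agree for any input, which is consistent with the remark above that such a narrowing cast is basically a no-op for this use.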
@sandreenko I take it you are no longer actively working on this? Seems like we should move it to future.
I agree with that.
ML.Net code has several places where we do `a = a * const_int;`, for example, `MurmurHash` has 6 `imul` instructions in the final asm for x64 https://github.com/dotnet/machinelearning/blob/b861b5d64841cbe0f2c866ee7586872aac450a51/src/Microsoft.ML.Core/Utilities/Hashing.cs#L118 and in some cases, we do them with one extra `mov`:
instead of
The non-optimal codegen happens when we inline `MurmurRound` and create a temp LCL_VAR for the `hash` argument: `hash = MurmurRound(hash, (uint)len);`
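
For readers without the ML.NET source at hand, the shape of the hot code can be sketched in C++ roughly as follows. This assumes `MurmurRound` follows the public 32-bit MurmurHash3 round (the constants below come from that algorithm, not from this issue), and `hashWords` is a hypothetical caller that mirrors the `hash = MurmurRound(hash, ...)` pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rotate left by r bits; r must be in the range 1..31.
inline std::uint32_t rotl32(std::uint32_t v, unsigned r) {
    return (v << r) | (v >> (32u - r));
}

// One mixing round, assumed to match the MurmurHash3 x86_32 round.
// Note the three "a = a * const" multiplies.
inline std::uint32_t murmurRound(std::uint32_t hash, std::uint32_t chunk) {
    chunk *= 0xCC9E2D51u;
    chunk = rotl32(chunk, 15);
    chunk *= 0x1B873593u;
    hash ^= chunk;
    hash = rotl32(hash, 13);
    hash = hash * 5u + 0xE6546B64u;
    return hash;
}

// Hypothetical caller: the result of each round is fed back in as the
// first argument, like hash = MurmurRound(hash, ...) in the issue.
inline std::uint32_t hashWords(const std::uint32_t* words, std::size_t count,
                               std::uint32_t seed) {
    std::uint32_t hash = seed;
    for (std::size_t i = 0; i < count; ++i) {
        hash = murmurRound(hash, words[i]);
    }
    return hash;
}
```

The sketch only shows the source pattern. Whether a given compiler emits an extra `mov` for it depends on that compiler; the codegen discussed here is specific to the JIT.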
IR looks like:
and we want to get rid of `STMT00030`.
I have thought about 3 possible places where it could be done:
`Compiler::fgInlinePrependStatements`, https://github.com/dotnet/coreclr/blob/master/src/jit/flowgraph.cpp#L23243
I have tried all of them and did not get a good result:
3:
`fgInlinePrependStatements` can already replace an argument that was used only once with the original tree. I was able to teach it to replace an argument that was originally a lcl_var with loads of that lcl_var (instead of creating a new one), but only if the argument is not modified inside the inlined method (not our case). That means it supports cases like
We can support defs if we know that `inline myMethod(lclVar0);` is the last use of `lclVar0` (our case, because we have `hash = call(hash)`), but this happens before we generate liveness information, so we don't know that `call(lclVar0)` is the last use of `lclVar0`.
2:
`ContainCheckMul` sets contained on `IsContainableMemoryOp`, so it doesn't support moves from one register to another; forcing it to set contained on `[000077]` gave me many asserts.
1:
`Compiler::optCopyProp` looks like the best candidate to handle this extra move, but currently, it doesn't work because:
1.1 `[000361] D------N---- +--* LCL_VAR long V04 loc1 d:3` doesn't have a VN pair, because it is a phi statement that is processed here: https://github.com/dotnet/coreclr/blob/c8ad76dd8169238c085ee6e3f03d074aed4b76b2/src/jit/valuenum.cpp#L5885-L5890 and there we don't set `VNPair` for the tree, so `copyProp` ends on: https://github.com/dotnet/coreclr/blob/c8ad76dd8169238c085ee6e3f03d074aed4b76b2/src/jit/copyprop.cpp#L203-L207
If we change that and assign a VNPair for `[000361]`, then we would consider it as a candidate for the `[000081] D------N---- +--* LCL_VAR int V08 tmp1 d:3 $286` replacement, but it would be declined, because `000361` is long and `000081` is int, so `copyProp` ends on: https://github.com/dotnet/coreclr/blob/c8ad76dd8169238c085ee6e3f03d074aed4b76b2/src/jit/copyprop.cpp#L208-L211
If we fix that, we will still have different VN values, so the copy propagation won't happen: https://github.com/dotnet/coreclr/blob/c8ad76dd8169238c085ee6e3f03d074aed4b76b2/src/jit/copyprop.cpp#L212-L215
But if we somehow skip these checks (manually in a debugger, for example) and do the propagation, then we get the asm that we want without any asserts in later stages.
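
The three rejections described for `optCopyProp` can be summarized with a toy model. This is only a sketch of the order of checks as described in this issue, not the JIT's actual code: `LclUse`, `CopyVerdict`, `checkCopy`, and `kNoVN` are hypothetical names, and value numbers are plain integers here.

```cpp
#include <cassert>
#include <cstdint>

enum class VarType { Int, Long };

// Hypothetical marker for "no value number was assigned to this node".
constexpr std::uint32_t kNoVN = 0;

// Toy stand-in for a LCL_VAR node: its type and its value number.
struct LclUse {
    VarType type;
    std::uint32_t vn;
};

enum class CopyVerdict { NoValueNumber, TypeMismatch, ValueNumberMismatch, Propagate };

// Apply the checks in the order the issue describes them:
// 1) the candidate must have a value number (the phi def has none),
// 2) the types must match (long vs int fails here),
// 3) the value numbers must be equal.
inline CopyVerdict checkCopy(const LclUse& candidate, const LclUse& use) {
    if (candidate.vn == kNoVN) return CopyVerdict::NoValueNumber;
    if (candidate.type != use.type) return CopyVerdict::TypeMismatch;
    if (candidate.vn != use.vn) return CopyVerdict::ValueNumberMismatch;
    return CopyVerdict::Propagate;
}
```

In this model, the phi def `[000361]` is rejected at the first check, and relaxing each check in turn only moves the rejection to the next one, which matches the sequence of stops described above.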
Note: the moves are cheap, but there are many of them, so I expect this to give us at least a measurable code size improvement.
category:cq theme:basic-cq skill-level:intermediate cost:medium impact:medium