Open josharian opened 4 years ago
A line-numbering hack is to remember which ones were "lost" in a block, then do a second pass, looking for an "eligible" new home. Eligible means that it hits certain requirements for Op type, and ties are broken by "has no inputs from same line" (i.e., the "first" instruction in the line). Ineligible ops are encoded in isPoorStatementOp(op)
.
If there's no replacement in the same block, other heuristics might help. If the eliminated instruction has an eligible input on the same line (this should not be common, goal of line-numbering heuristics is to find the first "real" Op on a line), go there. Otherwise if blocks are in flow order, a successor might contain a new home, so carry that it forward.
eliminate only values that have v.Uses == 0
Some optimization rules result in dangling values without any uses. Having a cheaper deadcode that ran as part of the rewrite passes would be very handy for helping v.Uses == 1
constraints fire reliably, and should allow us to drop the clobberIfDead
function that is currently used as a workaround for this issue (usually to allow unaligned load/store ops to be matched).
A little bit ago I hacked a CL that eliminates Uses==0 values during rewrite. I needed it to force some Uses==1 rules to trigger. Unfortunately, it caused the assembler to enter an infinite loop. Not sure what that was about, and never tracked it down.
I suspect this would end up improving both compiled code quality and compiler performance.
It is definitely true. If there is a cheaper DCE, then we can call it in more places where it is needed and more optimization opportunities will appear. For example, the fuse pass, if we can clean up dead blocks in time, we can handle more situations.
Change https://golang.org/cl/249463 mentions this issue: cmd/compile: invalidate zero-use values during rewrite
There are a number of places in the SSA pipeline where it would produce better code if we could run deadcode. However, deadcode is expensive.
I suspect we could create a cheaper, only slightly less effective version of deadcode:
One complication is line number handling, which currently depends on an ordering map which is created while building the full value liveness graph. I am not sure how best to address this; I find the existing code confusing. Perhaps @dr2chase has ideas.
We could use it in more places; we could probably also replace many uses of the full deadcode pass with it. I suspect this would end up improving both compiled code quality and compiler performance.
This is a reminder issue to investigate this. It's probably not a good introduction to the compiler, but help is definitely welcome!