Quuxplusone / LLVMBugzillaTest

[Subreg liveness] Enabling subreg liveness increases spilling significantly #40312

Open Quuxplusone opened 5 years ago

Quuxplusone commented 5 years ago
Bugzilla Link PR41342
Status NEW
Importance P enhancement
Reported by Jonas Paulsson (paulsson@linux.vnet.ibm.com)
Reported on 2019-04-02 02:24:06 -0700
Last modified on 2019-04-04 14:40:21 -0700
Version trunk
Hardware PC Linux
CC kparzysz@quicinc.com, llvm-bugs@lists.llvm.org, paulsson@linux.vnet.ibm.com, quentin.colombet@gmail.com, uweigand@de.ibm.com
Fixed by commit(s)
Attachments decompress.ll (341651 bytes, text/plain)
tc_decompress_reduced.ll (16411 bytes, text/plain)
Blocks
Blocked by
See also

Created attachment 21715: unreduced test case, llc input

I found that the bzip2 benchmark regressed slightly when enabling subreg
liveness due to one particular function (decompress). It seems that for some
reason subreg liveness increases spilling instead of decreasing it:

./bin/llc -mtriple=s390x-linux-gnu -mcpu=z13 ./decompress.ll -o out.s --stats |& grep spill
   1 regalloc              - Number of rematerialized defs for spilling
   9 regalloc              - Number of spilled snippets
  30 regalloc              - Number of spill slots allocated
  79 regalloc              - Number of spilled live ranges
 128 regalloc              - Number of spills inserted
  21 regalloc              - Number of spills removed

./bin/llc -mtriple=s390x-linux-gnu -mcpu=z13 ./decompress.ll -o out.subr.s -systemz-subreg-liveness --stats |& grep spill
   1 regalloc              - Number of rematerialized defs for spilling
  25 regalloc              - Number of spilled snippets
  27 regalloc              - Number of spill slots allocated
 168 regalloc              - Number of spilled live ranges
 173 regalloc              - Number of spills inserted
 149 regalloc              - Number of spills removed
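
To make the two runs above easier to compare, here is a minimal sketch that parses the `--stats` lines and prints the per-counter difference. The helper names (`parse_stats`, `diff_stats`) are illustrative and not part of LLVM; the counts are copied from the output above.

```python
import re

# Stats lines copied from the two llc runs on decompress.ll above.
BASELINE = """\
   1 regalloc              - Number of rematerialized defs for spilling
   9 regalloc              - Number of spilled snippets
  30 regalloc              - Number of spill slots allocated
  79 regalloc              - Number of spilled live ranges
 128 regalloc              - Number of spills inserted
  21 regalloc              - Number of spills removed
"""

SUBREG = """\
   1 regalloc              - Number of rematerialized defs for spilling
  25 regalloc              - Number of spilled snippets
  27 regalloc              - Number of spill slots allocated
 168 regalloc              - Number of spilled live ranges
 173 regalloc              - Number of spills inserted
 149 regalloc              - Number of spills removed
"""

# One stats line looks like: "<count> <pass name> - <description>".
STAT_LINE = re.compile(r"^\s*(\d+)\s+(\S+)\s+-\s+(.*\S)\s*$")


def parse_stats(text):
    """Map each statistic description to its integer count."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line)
        if match:
            stats[match.group(3)] = int(match.group(1))
    return stats


def diff_stats(before, after):
    """Return after - before for every counter seen in either run."""
    keys = sorted(set(before) | set(after))
    return {key: after.get(key, 0) - before.get(key, 0) for key in keys}


base = parse_stats(BASELINE)
subr = parse_stats(SUBREG)
delta = diff_stats(base, subr)

for name, change in delta.items():
    print(f"{change:+5d}  {name}")
```

Counters missing from one run (as in the reduced test case, where "spilled snippets" only appears with subreg liveness) are treated as 0.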

I also made a reduced test case from this file:

./bin/llc -mtriple=s390x-linux-gnu -mcpu=z13 ./tc_decompress_reduced.ll -o out.subr.s  --stats |& grep spill
  9 regalloc              - Number of spill slots allocated
  9 regalloc              - Number of spilled live ranges
 10 regalloc              - Number of spills inserted

./bin/llc -mtriple=s390x-linux-gnu -mcpu=z13 ./tc_decompress_reduced.ll -o out.subr.s  -systemz-subreg-liveness --stats |& grep spill
  2 regalloc              - Number of spilled snippets
 11 regalloc              - Number of spill slots allocated
 11 regalloc              - Number of spilled live ranges
 13 regalloc              - Number of spills inserted

I wonder if anyone could help me understand why this happens?

Quuxplusone commented 5 years ago

Attached decompress.ll (341651 bytes, text/plain): unreduced test case, llc input

Quuxplusone commented 5 years ago

Attached tc_decompress_reduced.ll (16411 bytes, text/plain): reduced testcase

Quuxplusone commented 5 years ago

Is the spill cost bigger?

Essentially, what I am wondering is whether we end up inserting more spill instructions, but in cheaper places, which would be expected.

Quuxplusone commented 5 years ago

(In particular, the number of spill slots is close in both cases, so I am guessing/hoping this is just better placement.)
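
The point raised in the last two comments can be illustrated with a toy calculation: a larger number of spill instructions can still have a lower total cost if they sit in rarely executed blocks. This is only a sketch. The frequencies below are invented for illustration and are not measured from decompress.ll, and `weighted_cost` is a hypothetical helper, not an LLVM API.

```python
# Hypothetical placements: each entry is the relative execution frequency
# of the block holding one spill or reload instruction. These values are
# made up for illustration only.
few_hot = [100.0, 100.0, 100.0]   # 3 instructions inside a hot loop
many_cold = [1.0] * 10            # 10 instructions in cold blocks


def weighted_cost(freqs):
    """Sum of block frequencies, one term per inserted instruction."""
    return sum(freqs)


print(len(few_hot), weighted_cost(few_hot))
print(len(many_cold), weighted_cost(many_cold))
```

Under these assumed frequencies, the placement with more instructions has the smaller weighted cost, which is why the raw "spills inserted" count alone does not settle whether the allocation got worse.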