Closed lhames closed 16 years ago
Thanks! I've applied your patch. http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20081013/068635.html
Fix for bug 2898 - defers dead store removal caused by re-uses until validity of re-use is confirmed. This patch has been tested on SingleSource and MultiSource and does not seem to introduce any new errors on AMD64. As yet I haven't worked out whether deferring the removal of dead store causes could cause other optimisation opportunities to be missed, but I see no reason (yet) that it should.
Failure bitcode from bugpoint. You'll need to compile this using
llc -regalloc=pbqp -o bugpoint.test.bc.s bugpoint.test.bc
to reproduce the bug.
Please attach a bc file. Thanks.
Extended Description
The ldecod benchmark (MultiSource/Applications/JM/ldecod) is miscompiled for AMD64. I believe this is due to a bug in LocalSpiller where certain stores are assumed to be "dead", and thus are removed, when they are in fact still necessary.
The problem is exposed when the PBQP allocator is used on AMD64. It can be demonstrated by adding -regalloc=pbqp to LLCBETA_OPTIONS, and running, from projects/test-suite/MultiSource/Applications/JM/ldecod/Output,
make ENABLE_LLC=1 ENABLE_LLCBETA=1
The resulting program segfaults when run on the test input:
bash-3.2$ ./Output/ldecod.llc-beta -i data/test.264 -o Output/test_dec.yuv -r data/test_rec.yuv
----------------------------- JM 12.1 (FRExt) ----------------------------- Decoder config file : (null)
Input H.264 bitstream : data/test.264 Output decoded YUV : Output/test_dec.yuv Output status file : log.dec Input reference file : data/test_rec.yuv
POC must = frame# or field# for SNRs to be correct
Frame POC Pic# QP SnrY SnrU SnrV Y:U:V Time(ms)
Segmentation fault
The test case can be reduced with the following:
bugpoint -run-llc ldecod.llvm.bc --args -i ../data/test.264 -r ../data/test_rec.yuv --tool-args -regalloc=pbqp
Faulting basic block is itrans8x8_bb15_bb15_2E_ce.
After register allocation, before spilling bb15.ce contains the following two instructions (among many others):
reg167, reg1170 and reg1171 are spill intervals. Pertinent allocations are: reg1170 = R8, reg1171 = R9, reg1089 = R10, reg 1167 = R8D
During spilling by LocalSpiller the following problem occurs:
Store: MOV64mr <fi#10>, 1, %reg0, 0, %R8
Remembering SS#10 in physreg R8
MOV64mr <fi#10>, 1, %reg0, 0, %R8
Reusing SS#10 from physreg R8 for vreg1171 instead of reloading into physreg R9
Removed dead store: MOV64mr <fi#10>, 1, %reg0, 0, %R8
PhysReg R9 clobbered, invalidating SS#9
PhysReg R8 clobbered, invalidating SS#10
Remembering SS#10 in physreg R9
%R9 = MOV64rm <fi#10>, 1, %reg0, 0
Reuse undone!
Remembering SS#8 in physreg R8D
%R8D = MOV32rm <fi#8>, 1, %reg0, 0
MOV32mr %R9, 4, %R10, 0, %R8D, Mem:ST(4,4) [0x15c0628 + 0]
In attempting to Reuse SS#10 in physreg R8 for the second instruction the store of reg1170/R8 is treated as dead by the block starting it VirtRegMap.cpp:1441. When it is discovered that R8 is clobbered by a later parameter the load into R9 is re-instated, but the store is not, leading to junk in R9. In general I do not think dead stores cannot be eliminated until the validity of the re-uses are confirmed.