Closed Quuxplusone closed 14 years ago
Attached bugpoint-tooptimize.bc
(32680 bytes, application/octet-stream): bugpoint-tooptimize.bc
Attached bugpoint-tonotoptimize.bc
(585036 bytes, application/octet-stream): bugpoint-tonotoptimize.bc
Attached ocamlrun.bc
(573388 bytes, application/octet-stream): ocamlrun.bc
At first glance the transform is correct, but maybe I am missing something.
If I look at the asm diff:
- movq intern_src(%rip), %rax
- leaq 2(%rax), %rcx
- movq %rcx, intern_src(%rip)
- movzbl 1(%rax), %ecx
- movsbq (%rax), %rax
- shlq $8, %rax
- orq %rcx, %rax
- leaq 1(,%rax,2), %rax
- movq %rax, (%rdi)
+ addq $2, intern_src(%rip)
+ movq $-1, (%rdi)
It is kind of hard to follow, but where does the -1 constant come from??
It is certainly not in the bitcode output from instcombine.
So maybe this is a backend problem, and not an instcombine problem?
FWIW if I run the bitcode output from -instcombine through the C backend, and
gcc -O3 I get:
intern_rec_sw_2E_bb125:
.LFB16:
movq intern_src(%rip), %rdx
leaq 2(%rdx), %rax
movq %rax, intern_src(%rip)
movsbq (%rdx),%rax
movzbl 1(%rdx), %edx
salq $8, %rax
orq %rdx, %rax
addq %rax, %rax
orq $1, %rax
movq %rax, (%rdi)
ret
And here is what llc generates:
intern_rec_sw_2E_bb125:
.LBB1_0: # newFuncRoot
.LBB1_1: # sw.bb125
addq $2, intern_src(%rip)
movq $-1, (%rdi)
.LBB1_2: # sw.epilog.exitStub
ret
That $-1 looks bogus.
Attached optimized.bc
(32612 bytes, application/octet-stream): optimized.bc
Hmm, llc thinks the optimized bitcode has undefined behaviour, does it ?!
Replacing.3 0x2a69110: i64 = srl 0x2a3a4e8, 0x2a69208
With: 0x2a3a3f0: i64 = undef
Replacing.6 0x2a3a4e8: i64,ch = load 0x2a3a2f8, 0x2a39f18, 0x2a3a3f0
<0x2a2ed68:0> <anyext i8> alignment=1
With chain: 0x2a3a2f8: ch = store 0x2a39f18:1, 0x2a3a010, 0x2a3a108, 0x2a3a3f0
<0x29cdf58:0> alignment=8
Replacing.3 0x2a3a7d0: i64 = sign_extend_inreg 0x2a3a3f0, 0x2a3a6d8
With: 0x2a3a3f0: i64 = undef
Replacing.3 0x2a69300: i64 = or 0x2a3a3f0, 0x2a69cb0
With: 0x2a3a6d8: i64 = Constant <-1>
valgrind gives some warning when llc is run with -debug:
==32222== Conditional jump or move depends on uninitialised value(s)
==32222== at 0x4CC1BE6: std::ostreambuf_iterator<char,
std::char_traits<char> > std::num_put<char, std::ostreambuf_iterator<char,
std::char_traits<char> > >::_M_insert_int<long>(std::ostreambuf_iterator<char,
std::char_traits<char> >, std::ios_base&, char, long) const (in
/usr/lib/libstdc++.so.6.0.12)
==32222== by 0x4CC1E65: std::num_put<char, std::ostreambuf_iterator<char,
std::char_traits<char> > >::do_put(std::ostreambuf_iterator<char,
std::char_traits<char> >, std::ios_base&, char, long) const (in
/usr/lib/libstdc++.so.6.0.12)
==32222== by 0x4CD1B8D: std::ostream& std::ostream::_M_insert<long>(long)
(in /usr/lib/libstdc++.so.6.0.12)
==32222== by 0x90B6DF: llvm::BaseStream<std::ostream>&
llvm::BaseStream<std::ostream>::operator<< <int>(int const&) (Streams.h:57)
==32222== by 0xC0DD88: (anonymous namespace)::X86ISelAddressMode::dump()
(X86ISelDAGToDAG.cpp:93)
==32222== by 0xC0E4D5: (anonymous
namespace)::X86DAGToDAGISel::MatchAddress(llvm::SDValue, (anonymous
namespace)::X86ISelAddressMode&, unsigned int) (X86ISelDAGToDAG.cpp:742)
==32222== by 0xC12668: (anonymous
namespace)::X86DAGToDAGISel::SelectAddr(llvm::SDValue, llvm::SDValue,
llvm::SDValue&, llvm::SDValue&, llvm::SDValue&, llvm::SDValue&, llvm::SDValue&)
(X86ISelDAGToDAG.cpp:1165)
==32222== by 0xCD9F7D: (anonymous
namespace)::X86DAGToDAGISel::Select_ISD_STORE(llvm::SDValue const&)
(X86GenDAGISel.inc:40475)
==32222== by 0xD2705C: (anonymous
namespace)::X86DAGToDAGISel::SelectCode(llvm::SDValue) (X86GenDAGISel.inc:55147)
==32222== by 0xD289D3: (anonymous
namespace)::X86DAGToDAGISel::Select(llvm::SDValue) (X86ISelDAGToDAG.cpp:1673)
==32222== by 0xD2BE5C: (anonymous
namespace)::X86DAGToDAGISel::SelectRoot(llvm::SelectionDAG&)
(DAGISelHeader.h:112)
==32222== by 0xD2D3B4: (anonymous
namespace)::X86DAGToDAGISel::InstructionSelect() (X86ISelDAGToDAG.cpp:624)
==32222== Use of uninitialised value of size 8
==32222== at 0x4CBB863: (within /usr/lib/libstdc++.so.6.0.12)
==32222== by 0x4CC1C12: std::ostreambuf_iterator<char,
std::char_traits<char> > std::num_put<char, std::ostreambuf_iterator<char,
std::char_traits<char> > >::_M_insert_int<long>(std::ostreambuf_iterator<char,
std::char_traits<char> >, std::ios_base&, char, long) const (in
/usr/lib/libstdc++.so.6.0.12)
==32222== by 0x4CC1E65: std::num_put<char, std::ostreambuf_iterator<char,
std::char_traits<char> > >::do_put(std::ostreambuf_iterator<char,
std::char_traits<char> >, std::ios_base&, char, long) const (in
/usr/lib/libstdc++.so.6.0.12)
==32222== by 0x4CD1B8D: std::ostream& std::ostream::_M_insert<long>(long)
(in /usr/lib/libstdc++.so.6.0.12)
==32222== by 0x90B6DF: llvm::BaseStream<std::ostream>&
llvm::BaseStream<std::ostream>::operator<< <int>(int const&) (Streams.h:57)
==32222== by 0xC0DD88: (anonymous namespace)::X86ISelAddressMode::dump()
(X86ISelDAGToDAG.cpp:93)
==32222== by 0xC0E4D5: (anonymous
namespace)::X86DAGToDAGISel::MatchAddress(llvm::SDValue, (anonymous
namespace)::X86ISelAddressMode&, unsigned int) (X86ISelDAGToDAG.cpp:742)
==32222== by 0xC12668: (anonymous
namespace)::X86DAGToDAGISel::SelectAddr(llvm::SDValue, llvm::SDValue,
llvm::SDValue&, llvm::SDValue&, llvm::SDValue&, llvm::SDValue&, llvm::SDValue&)
(X86ISelDAGToDAG.cpp:1165)
==32222== by 0xCD9F7D: (anonymous
namespace)::X86DAGToDAGISel::Select_ISD_STORE(llvm::SDValue const&)
(X86GenDAGISel.inc:40475)
==32222== by 0xD2705C: (anonymous
namespace)::X86DAGToDAGISel::SelectCode(llvm::SDValue) (X86GenDAGISel.inc:55147)
==32222== by 0xD289D3: (anonymous
namespace)::X86DAGToDAGISel::Select(llvm::SDValue) (X86ISelDAGToDAG.cpp:1673)
==32222== by 0xD2BE5C: (anonymous
namespace)::X86DAGToDAGISel::SelectRoot(llvm::SelectionDAG&)
(DAGISelHeader.h:112)
Fix committed here, the bug was a negative shiftamount created by dagcombiner,
and subsequently eliminated as undefined:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090518/077932.html
ocamlbyterun works correctly now with -O1.
bugpoint-tooptimize.bc
(32680 bytes, application/octet-stream)bugpoint-tonotoptimize.bc
(585036 bytes, application/octet-stream)ocamlrun.bc
(573388 bytes, application/octet-stream)optimized.bc
(32612 bytes, application/octet-stream)