Quuxplusone / LLVMBugzillaTest

Miscompilation after r355962: [SROA] Fix a crash when trying to convert a memset to an non-integral pointer #42496

Open Quuxplusone opened 5 years ago

Quuxplusone commented 5 years ago
Bugzilla Link PR43526
Status NEW
Importance P enhancement
Reported by dmajor (dmajor@bugmail.cc)
Reported on 2019-10-01 08:41:57 -0700
Last modified on 2019-10-03 12:43:41 -0700
Version unspecified
Hardware PC Windows NT
CC efriedma@quicinc.com, htmldeveloper@gmail.com, listmail@philipreames.com, llvm-bugs@lists.llvm.org
Fixed by commit(s)
Attachments miscompile.txt (3466 bytes, text/plain)
Blocks PR43360
Blocked by
See also
Created attachment 22613
Miscompilation analysis

After moving Firefox builds to clang 9.0.0, some of our tests failed because of
a miscompilation that began in r355962 according to a bisect.

Unfortunately I don't have a small repro and I'm not sure it's feasible to
produce one. The affected code is part of a huge binary that is built with LTO
and PGO, and we only see the issue on a single platform (Android ARM32) that is
difficult to build and debug locally.

I'm attaching a file containing a colleague's analysis of the generated code
near the point of failure (from
https://bugzilla.mozilla.org/show_bug.cgi?id=1583907#c17). A value that we
needed was placed into r0, but then r0 was immediately overwritten by something
else.

@reames, is there any chance you might be able to have a look at the analysis
and intuit what went wrong? I'm not sure what other hope we have of debugging
this.
Quuxplusone commented 5 years ago

Attached miscompile.txt (3466 bytes, text/plain): Miscompilation analysis

Quuxplusone commented 5 years ago

There's a strong possibility that the patch itself is fine, and you're finding some unrelated miscompile. You could try dumping the "-mllvm -debug-only=sroa" log before/after the patch, to check if there's some difference that's obviously wrong, I guess.
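
One way to do the before/after comparison suggested above is to capture the two logs to files and diff them. The following is only a minimal sketch: the file labels and the command-line handling are illustrative assumptions, not anything from the Firefox build, and it assumes the logs have already been captured from a compiler that emits `-debug-only` output (normally an assertions-enabled build).

```python
import difflib
import sys


def sroa_log_diff(before, after):
    """Return unified-diff lines between two debug logs given as strings."""
    return list(difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile="before-r355962.log",  # hypothetical label, display only
        tofile="after-r355962.log",     # hypothetical label, display only
        lineterm="",
        n=2,  # two lines of context around each change
    ))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical): python sroa_diff.py before.log after.log
    with open(sys.argv[1]) as f_before, open(sys.argv[2]) as f_after:
        for line in sroa_log_diff(f_before.read(), f_after.read()):
            print(line)
```

A plain textual diff can be noisy if value names or function ordering differ between the two compilers, so the output may need filtering down to the function near the point of failure.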

It should be possible to extract a small reproduction... although we don't have great tooling to do that with ThinLTO at the moment. You can use lld -save-temps to get the bitcode and object files for each ThinLTO unit, find the relevant file, do whatever transform on it, and link against the other ELF files dumped by -save-temps. (It's not obvious if the bad transform here is happening at ThinLTO time, or during bitcode generation, though.)

https://llvm.org/docs/OptBisect.html is generally useful for debugging configurations which are hard to break into separate steps. But there isn't any way to make it properly bisect a ThinLTO pipeline at the moment. (Someone should probably look into implementing that.)
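
For a single, non-ThinLTO compile step, the search that OptBisect enables can be sketched as a binary search over the value passed to `-opt-bisect-limit` (for clang, via `-mllvm -opt-bisect-limit=N`). This is a sketch under stated assumptions: `is_good` is a hypothetical callback that would rebuild with the given limit and run the failing test, and the search assumes that once a limit produces a bad build, every larger limit does too.

```python
def first_bad_limit(is_good, upper):
    """Return the smallest limit in [1, upper] for which is_good(limit) is False.

    Assumes is_good(0) is True, is_good(upper) is False, and that the
    result is monotonic: once a limit is bad, all larger limits are bad.
    is_good is a hypothetical callback supplied by the caller.
    """
    lo, hi = 0, upper  # invariant: lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(mid):
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    # Stand-in predicate for demonstration only: pretend that the 37th
    # optimization is the one that introduces the miscompile.
    print(first_bad_limit(lambda limit: limit < 37, 1000))
```

The returned value is the number of the first optimization whose inclusion makes the test fail. Each probe costs a full rebuild and test run, which is why the remote-only build setup described in this report makes the approach expensive.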

Quuxplusone commented 5 years ago

I've attempted to gather the various bits of information that have been requested, but it's turning out to be too much of a pain. As I've mentioned, this is an especially frustrating build configuration: the builds don't even run on my machine, so I have to send them off to a remote server.

I had been hoping that a problem might be obvious from inspection of the codegen (sometimes this works, so it's worth trying), but if we need to dig deeper then there isn't much hope. Maybe we or someone else will hit this again on a more easily debuggable build configuration.