Open Manishearth opened 4 years ago
Can you check what the .err files in /tmp/rr-test-* say?
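For anyone else triaging similar failures, here is a minimal sketch for dumping the non-empty .err files in one go. It assumes the .err files sit directly inside each /tmp/rr-test-* directory; the function name and the `tail_lines` parameter are made up for illustration and are not part of rr.

```python
import glob
import os


def collect_err_files(pattern="/tmp/rr-test-*/*.err", tail_lines=20):
    """Return {path: last few lines} for every non-empty .err file.

    The default pattern is an assumption about where the test harness
    leaves its output; adjust it if your layout differs.
    """
    results = {}
    for path in sorted(glob.glob(pattern)):
        if os.path.getsize(path) == 0:
            continue  # empty .err files carry no diagnostics
        with open(path, errors="replace") as f:
            lines = f.read().splitlines()
        results[path] = lines[-tail_lines:]
    return results


if __name__ == "__main__":
    for path, tail in collect_err_files().items():
        print(f"==> {path}")
        print("\n".join(tail))
```

Only the tail of each file is kept because the fatal assertion is usually the last thing written, which keeps the output short enough to paste into a comment.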
I can reproduce setuid-no-syscallbuf with ctest -R sometimes, I cannot reproduce the other failures.
You mean the setuid-no-syscallbuf test fails often, but the other tests fail very rarely?
On a Ryzen 3700X, after following the setup instructions, I get 12 failures out of 2487 tests, which is still great compared to before.
Summary
82 - clone_vfork_pidfd (Failed)
83 - clone_vfork_pidfd-no-syscallbuf (Failed)
920 - nested_detach_wait (Failed)
921 - nested_detach_wait-no-syscallbuf (Failed)
1140 - nested_detach (Failed)
1141 - nested_detach-no-syscallbuf (Failed)
1326 - clone_vfork_pidfd-32 (Failed)
1327 - clone_vfork_pidfd-32-no-syscallbuf (Failed)
2162 - nested_detach_wait-32 (Failed)
2163 - nested_detach_wait-32-no-syscallbuf (Failed)
2382 - nested_detach-32 (Failed)
2383 - nested_detach-32-no-syscallbuf (Failed)
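Since each test runs in several variants, it can help to collapse a failure summary like the one above into base test names. The following is a rough sketch, assuming the summary lines have the form `<number> - <name> (Failed)` and that the only variant suffixes are `-32` and `-no-syscallbuf`, as in this list; the helper names are hypothetical.

```python
import re
from collections import Counter

# Matches summary lines such as "82 - clone_vfork_pidfd (Failed)".
FAILED_LINE = re.compile(r"^\s*\d+\s+-\s+(\S+)\s+\(Failed\)")


def base_test_name(name):
    """Strip the variant suffixes seen in the list above (assumed exhaustive)."""
    for suffix in ("-no-syscallbuf", "-32"):
        if name.endswith(suffix):
            name = name[: -len(suffix)]
    return name


def group_failures(summary):
    """Count failed variants per base test name."""
    counts = Counter()
    for line in summary.splitlines():
        m = FAILED_LINE.match(line)
        if m:
            counts[base_test_name(m.group(1))] += 1
    return counts
```

Fed the twelve lines above, this reduces them to three base tests (clone_vfork_pidfd, nested_detach_wait, nested_detach) with four failing variants each, which makes it clearer that only a few distinct problems are involved.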
These are the .err files of all the tests. I can provide the rest of the files, but the tar would be too big to provide all at once. rr-tests.tar.gz
@v-lopez: Please file a separate issue for those. The clone_vfork_pidfd has a similar problem to what was fixed in 17aa8239c0a9ffd0e66623fc3627f664b384bf1e, and nested-detach has a different kind of assertion.
@Manishearth did setuid fail with something like the following?
[FATAL .../rr/src/Registers.cc:405:compare_register_files()]
(task 911147 (rec:857972) at time 365)
-> Assertion `!bail_error || match' failed to hold. Fatal register mismatch (ticks/rec:128273/128273)
On my end, with a 3990X, the setuid-no-syscallbuf test is failing (and that's the main one that I think I've seen fail with some repeated runs, though sometimes it doesn't fail), and the record.err says this:
[ERROR /home/pnkfelix/Dev/Mozilla/rr.git/src/Registers.cc:295:maybe_print_reg_mismatch()] r10 0x55a993c2e95a != 0x55a993c2e958 (replaying vs. recorded)
process 317197 sent SIGURG
I would expect that to be a duplicate of #2694. If you can pack and upload a trace I can verify whether or not the tracee is using RDRAND.
I assume the trace you want packed is the one in the same /tmp/rr-test-setuid-XXX directory; I've put a tarball of that whole directory below.
rr-test-setuid.tar.gz (this wasn't what you asked for; see below.)
Oh, I'm sorry, you asked me to pack it, and I didn't realize that meant running rr pack on it as described in #2694. I'll do that now.
Okay, this tarball has the packed version of the directory.
Unsupported instruction at 0x7f534449603f (opcode rdrand)
Can you replay the trace, run hbreak *0x7f534449603f in gdb, continue, and get a backtrace at that instruction?
So this confirms that my problem is a duplicate of issue #2694, since __getgrnam_r appears in the backtrace, right?
Yup, it's the same thing in systemd (which is fixed upstream at systemd/systemd#17115)
As this is identified as both an upstream issue (systemd) and duplicate, can we close this?
Testing rr on Ryzen 3900X (after running the ryzen workaround) and I get the following failures:
I can reproduce setuid-no-syscallbuf with ctest -R sometimes, I cannot reproduce the other failures.
I'm not sure if 3900X should be added to the list of supported Ryzen CPUs, are these known bits of flakiness?
cc @glandium