Open GoogleCodeExporter opened 9 years ago
Issue 87 also reports the CorruptionTest failing in the same way, but only
in a VM running CentOS, and possibly Ubuntu on ARM. Are you running Fedora in a
VM or on metal? Do you have a rough estimate of how many runs it takes before
you see this issue?
I have not been able to reproduce the issue on my Ubuntu Precise machine after
~800 runs of corruption_test, though I can possibly try on an ARM CrOS machine
later.
Testing on Fedora 17 could be helpful if it's not too much trouble. If it
doesn't reproduce there we can begin to narrow down differences.
Could you follow up about the CompactionInputErrorParanoid bug in issue 87? We
can use this (issue 182) for the HiddenValuesAreRemoved bug, which I have not
seen a report for until now.
Original comment by dgrogan@chromium.org
on 27 Jun 2013 at 5:45
This is on metal: 2 x 6-core Xeon. We saw multiple failures in 10 runs.
I'll test on Fedora 17 tomorrow. That's on a Core i7, though.
Original comment by fullung@gmail.com
on 27 Jun 2013 at 6:31
I suspect that there's a race condition in the test harness. If level 0 has not
been compacted by the time the test performs its check, the test will fail.
We fixed this for our environment by adding sleeps to the tests where
appropriate. A better fix would be to add some synchronization so the test
waits for compaction to finish.
Here's our code for reference (lines 325-326):
https://github.com/rescrv/HyperLevelDB/blob/master/db/corruption_test.cc
I didn't consider upstreaming these changes until now because the failures only
manifested themselves after we separated the memtable compaction into another
thread.
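
To make the "wait for compaction to finish" idea concrete, here is a minimal,
self-contained sketch of the pattern. It is not LevelDB or HyperLevelDB code:
FakeCompactor, ScheduleCompaction, WaitForCompaction and Level0Files are
hypothetical names invented for illustration, and the background "compaction"
is simulated with a short sleep. The point is only that the test blocks on a
condition variable until the background work reports completion, instead of
sleeping for a fixed time and hoping it was long enough.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a store that compacts level 0 on a background
// thread. None of these names come from LevelDB.
class FakeCompactor {
 public:
  ~FakeCompactor() {
    if (worker_.joinable()) worker_.join();
  }

  // Starts a simulated background compaction that takes roughly `work`.
  void ScheduleCompaction(std::chrono::milliseconds work) {
    if (worker_.joinable()) worker_.join();  // finish any earlier run first
    {
      std::lock_guard<std::mutex> lock(mu_);
      pending_ = true;
    }
    worker_ = std::thread([this, work] {
      std::this_thread::sleep_for(work);  // simulated compaction work
      {
        std::lock_guard<std::mutex> lock(mu_);
        level0_files_ = 0;  // level 0 has been compacted away
        pending_ = false;
      }
      done_cv_.notify_all();
    });
  }

  // Blocks until no compaction is pending or `timeout` elapses.
  // Returns true if the compaction finished in time.
  bool WaitForCompaction(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mu_);
    return done_cv_.wait_for(lock, timeout, [this] { return !pending_; });
  }

  int Level0Files() {
    std::lock_guard<std::mutex> lock(mu_);
    return level0_files_;
  }

 private:
  std::mutex mu_;
  std::condition_variable done_cv_;
  bool pending_ = false;
  int level0_files_ = 4;  // arbitrary starting value for the sketch
  std::thread worker_;
};
```

A test written against this sketch would call ScheduleCompaction, then
WaitForCompaction with a generous timeout, and only then check Level0Files.
The timeout serves as a safety net against hangs and is not a timing
assumption, so the check no longer depends on how fast the machine or VM
happens to be.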
Original comment by res...@gmail.com
on 27 Jun 2013 at 7:14
Okay, it seems the CompactionInputErrorParanoid puzzle is solved, so I'll focus
on gathering more information about HiddenValuesAreRemoved here.
Original comment by fullung@gmail.com
on 28 Jun 2013 at 4:06
Correction: it turns out we also saw this test failure inside a VM.
All I can guess is that the VM changes the timing in the tests.
Is there anything we can do to provide more information to debug this one?
Original comment by fullung@gmail.com
on 28 Jun 2013 at 2:09
I have been able to reproduce this problem and I am working on a fix. Thanks
for the reports.
Original comment by san...@google.com
on 1 Jul 2013 at 9:31
Original issue reported on code.google.com by
fullung@gmail.com
on 27 Jun 2013 at 8:07