Closed: tomzawadzki closed this 7 months ago
Another instance of this failure. Reported by @mmkayPL. log: https://ci.spdk.io/public_build/autotest-per-patch_116029.html
Hi, here is a quick update on the issue.
I'm able to reproduce very similar behavior on my setup just by running test/bdev/blockdev.sh nvme. Most of the time, when failover happens, some of the later tests fail. The probable cause is a second reset that occurs during failover and is not recognized by the caller (the reset test ends with a callback sent after the first reset is done), so the tests continue even though the second reset is still in progress. I was not able to reproduce the null pointer access in accel_sequence_complete_tasks, but I have observed other issues reported by ASAN, probably due to releasing a qpair while a reset is in progress.
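To make the suspected race concrete, here is a minimal sketch of the idea, not SPDK code and not the content of the patch. It assumes a hypothetical reset_tracker that counts resets in flight and only invokes the caller's callback once the last one has finished, so a second reset started by failover cannot be missed. All names (reset_tracker, tracker_reset_started, tracker_reset_finished, count_calls) are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper, not an SPDK structure or API. */
typedef void (*reset_done_cb)(void *ctx);

struct reset_tracker {
    int in_flight;     /* resets started but not yet completed */
    reset_done_cb cb;  /* caller's completion callback */
    void *cb_ctx;      /* opaque argument passed to cb */
};

static void tracker_init(struct reset_tracker *t, reset_done_cb cb, void *ctx)
{
    t->in_flight = 0;
    t->cb = cb;
    t->cb_ctx = ctx;
}

/* Call whenever a reset begins, including one triggered by failover. */
static void tracker_reset_started(struct reset_tracker *t)
{
    t->in_flight++;
}

/* Call when a reset finishes; the caller is notified only after the last one. */
static void tracker_reset_finished(struct reset_tracker *t)
{
    if (t->in_flight > 0 && --t->in_flight == 0 && t->cb != NULL) {
        t->cb(t->cb_ctx);
    }
}

/* Example callback: counts how many times completion was reported. */
static void count_calls(void *ctx)
{
    (*(int *)ctx)++;
}
```

With two overlapping resets, the first tracker_reset_finished() call leaves the callback uncalled and only the second one reports completion, which is the behavior the reset test would need in order not to continue too early.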
Patch ready for review: https://review.spdk.io/gerrit/c/spdk/spdk/+/21901
CI Intermittent Failure
This failure occurs intermittently once ASAN is enabled on the nvme-phy-autotest job. ASAN points to accel_sequence_complete_tasks(), but the issue might lie with failover being triggered as well. Another log, from crypto-phy-autotest, would confirm that the issue might be due to failover.
I wasn't able to reproduce it on my own system, and after adding more logging (here) the issue has not reproduced yet.
Link to the failed CI build
https://ci.spdk.io/results/autotest-per-patch/builds/113176/archive/nvme-phy-autotest_57397/build.log
https://ci.spdk.io/results/autotest-per-patch/builds/113351/archive/crypto-phy-autotest_11625/build.log
Execution failed at