shunghsiyu opened 4 years ago
I doubt that this will ever get fixed; the timings are just too unpredictable, and rebooting or even loading from a snapshot would make the test run way too slow.
I guess the best we can do is to keep the dmesg in one big file and only store the line offset of the relevant part in each testcase.
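A minimal sketch of that bookkeeping, assuming a single append-only log file (the helper names and record format here are made up for illustration):

```python
import os

# Sketch: append all serial-console/dmesg output to one big log file and
# remember only the byte offset where each testcase's output began.
# Helper names and file layout are hypothetical, not runltp-ng API.

def record_offset(log_path):
    """Return the current end-of-file offset, i.e. where the next
    testcase's output will begin."""
    return os.path.getsize(log_path)

def extract_slice(log_path, start, end=None):
    """Read the dmesg lines belonging to one testcase back out of the
    big log, given the offsets recorded around its run."""
    with open(log_path, "rb") as f:
        f.seek(start)
        data = f.read() if end is None else f.read(end - start)
    return data.decode(errors="replace")
```

The point is that each testcase result only carries two integers instead of a copy of the log.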
We could try waiting for the system load to decrease below some threshold before continuing. However it seems like some of the metrics for measuring load are broken on the low latency scheduler on my machine so this could be tricky.
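The load-threshold wait could look something like this sketch; the threshold, timeout, and poll interval are guesses, not tuned numbers, and (as noted above) the load metric itself may be unreliable on some schedulers:

```python
import os
import time

def wait_for_low_load(threshold=0.5, timeout=300, poll=5):
    """Poll the 1-minute load average until it drops below `threshold`,
    or give up after `timeout` seconds. Returns True if the system
    settled, False on timeout. All parameter defaults are guesses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.getloadavg()[0] < threshold:
            return True
        time.sleep(poll)
    return False
```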
I guess you can also compare the PIDs in the stack trace with the reproducer's.
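A rough sketch of that PID matching, assuming backtrace headers in the `PID: <n> Comm: <name>` form shown below (other kernel configurations may format them differently; both helper names are hypothetical):

```python
import re

def backtrace_pids(dmesg_text):
    """Collect the PIDs appearing in 'PID: <n> Comm: <name>' backtrace
    headers of a dmesg excerpt."""
    return {int(m.group(1))
            for m in re.finditer(r"PID:\s*(\d+)\s+Comm:", dmesg_text)}

def belongs_to_reproducer(dmesg_text, reproducer_pids):
    """True if any backtrace PID matches a PID owned by the current
    reproducer, i.e. the trace is probably not leftover output."""
    return bool(backtrace_pids(dmesg_text) & set(reproducer_pids))
```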
I decided to try the crazy idea of rebooting after every single reproducer run, to see how painful vs. useful it is.
Any suggestion for the runltp-ng option name, if the crazy idea turns out useful enough for a PR? (`--always-reboot`?)
I'd prefer the more generic `--always-revert`, because of #23.
Sometimes printk messages do not show up immediately when the reproducer runs, and are only emitted when the next reproducer starts.
In the example below, we can see that the backtrace for the previous reproducer (`PID: 22329 Comm: 7e93129ee310878`) showed up in the result of `7e7470f0058e0950706a4fd684c2d86c7b43df31`.
This is a hard one to solve, because there may be tasks related to the last reproducer lingering around somewhere (e.g. inside work queues).
Some ideas for this issue:

- Run `dmesg` to flush the printk buffer (not sure if this will work)
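That flush could be sketched roughly like this; whether draining the ring buffer actually forces out deferred printk output is exactly the open question above, and `dmesg --read-clear` needs root (or `kernel.dmesg_restrict=0`):

```python
import subprocess

def flush_kernel_log(cmd=("dmesg", "--read-clear")):
    """Drain the kernel ring buffer between reproducer runs, so that
    messages still sitting in it are attached to the testcase that
    produced them instead of surfacing during the next run.
    Whether this reliably flushes deferred printk output is unverified;
    `cmd` is parameterized here only so the plumbing can be exercised
    without root."""
    result = subprocess.run(list(cmd), capture_output=True,
                            text=True, check=True)
    return result.stdout  # drained messages, for the previous testcase
```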