Open wks opened 7 months ago
If a thread is waiting on a lock for more than X seconds/minutes in ART, it panics and dies. Perhaps we need something similar.
https://github.com/mmtk/mmtk-jikesrvm/actions/runs/9124315467/job/25088273084?pr=172
In this test run, JikesRVM hung for 35 minutes without making progress while running lusearch with RFastAdaptiveMarkSweep. There is no indication of whether it hung during GC, but that is very likely.
We have recently observed some bugs causing tests to hang while doing GC. For example
Given that a typical GC shouldn't take more than a few seconds, there should be some watchdog mechanism so that, when a GC exceeds a time limit, the process can panic and print the stack traces of all threads.
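To make the idea concrete, here is a minimal sketch of such a watchdog using only the Rust standard library. It is not an existing MMTk API: the `Watchdog` type, its `arm` function, and the `on_timeout` callback are hypothetical names chosen for illustration. The guard would be armed when a GC starts and dropped when it finishes; if it is not dropped within the timeout, a monitor thread runs the callback. A real implementation would presumably dump the stack traces of all threads and abort the process inside that callback, which this sketch leaves to the caller.

```rust
use std::sync::mpsc::{self, RecvTimeoutError, Sender};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical watchdog guard: arm it when a GC starts, drop it when the GC ends.
pub struct Watchdog {
    /// Dropping this sender tells the monitor thread that the work finished.
    disarm: Option<Sender<()>>,
    /// Monitor thread that waits for either the disarm signal or the timeout.
    monitor: Option<JoinHandle<()>>,
}

impl Watchdog {
    /// Starts a monitor thread that calls `on_timeout` if the guard is still
    /// alive after `timeout` has elapsed.
    pub fn arm<F>(timeout: Duration, on_timeout: F) -> Self
    where
        F: FnOnce() + Send + 'static,
    {
        let (tx, rx) = mpsc::channel::<()>();
        let monitor = thread::spawn(move || {
            match rx.recv_timeout(timeout) {
                // Nothing arrived in time: the guarded work is taking too long.
                // A real watchdog would dump all thread stacks here and then
                // call something like std::process::abort().
                Err(RecvTimeoutError::Timeout) => on_timeout(),
                // A message arrived or the sender was dropped: disarmed in time.
                _ => {}
            }
        });
        Watchdog {
            disarm: Some(tx),
            monitor: Some(monitor),
        }
    }
}

impl Drop for Watchdog {
    fn drop(&mut self) {
        // Dropping the sender wakes the monitor with a Disconnected error.
        drop(self.disarm.take());
        // Wait for the monitor thread to exit so that it does not leak.
        if let Some(handle) = self.monitor.take() {
            let _ = handle.join();
        }
    }
}
```

A GC entry point could then hold something like `let _wd = Watchdog::arm(Duration::from_secs(60), dump_and_abort);` for the duration of the collection, where `dump_and_abort` and the 60-second limit are likewise placeholders. Using a callback rather than panicking directly in the monitor thread is a deliberate choice: a panic there would only terminate the monitor thread, not the hung process.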
A watchdog is also valuable for real-world applications, especially mobile applications. If an application is unresponsive, the OS will try to restart it or notify the user so they can take further action.