Open derekbruening opened 2 months ago
This code was recently changed by #6815 so it is possible this is an introduced regression.
The change in #6815 avoids running out of stack space by reusing the existing frame during native signal delivery for an almost-detached thread (one whose only remaining detach step is the removal of DR's main_signal_handler). That should actually make "sigaltstack too small in native thread" less likely, not more, since native signal delivery now uses less stack space.
We'll need more info on the exact sequence of events happening here.
Could this regression have something to do with https://github.com/DynamoRIO/dynamorio/pull/6868 merged 3 days ago?
I logged into the aarch64-precommit machine but can't reproduce it there:
derek@dynamorio:~/dr/build$ ctest --repeat-until-fail 500 -R detach_signal
Test project /home/derek/dr/build
Start 351: code_api|api.detach_signal
Test #351: code_api|api.detach_signal ....... Passed 0.22 sec
Start 351: code_api|api.detach_signal
Test #351: code_api|api.detach_signal ....... Passed 0.21 sec
...
1/1 Test #351: code_api|api.detach_signal ....... Passed 0.19 sec
100% tests passed, 0 tests failed out of 1
Total Test time (real) = 104.28 sec
I don't have access to the aarch64-sve-precommit machine, which is where it failed. @AssadHashmi, maybe you could run it on that machine 1000x and see if it reproduces? If so, maybe removing #6868 and repeating would show whether that is the culprit?
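For the 1000x run, a small wrapper along these lines might help keep the output of the failing iteration around. This is only a sketch: `repeat_until_fail` is a hypothetical helper name, not something in the DR tree, and it assumes a POSIX shell with `mktemp` available.

```shell
# Hypothetical helper (not part of DynamoRIO): rerun a command up to N times,
# stop at the first failure, and keep that iteration's output for inspection.
repeat_until_fail() {
    count=$1
    shift
    log=$(mktemp)   # each iteration overwrites this file
    i=1
    while [ "$i" -le "$count" ]; do
        if ! "$@" > "$log" 2>&1; then
            echo "failed on iteration $i"
            echo "output kept in $log" >&2
            return 1
        fi
        i=$((i + 1))
    done
    rm -f "$log"
    echo "passed $count iterations"
}
```

From the build directory it could be invoked as `repeat_until_fail 1000 ctest -R detach_signal`. The plain `ctest --repeat-until-fail 1000 -R detach_signal` form used above does the same looping; the wrapper only adds the saved log.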
This happened once on the aarch64-sve-precommit-256 test:
https://github.com/DynamoRIO/dynamorio/actions/runs/9914012718/job/27392240520?pr=6879