Open ronawho opened 6 years ago
My vote: deprecate and remove the deadlock detection because it doesn't always work (limited to single-node with tasks=fifo and even that isn't dependable) and we've clearly demonstrated a lack of resources to make it work. But retain the feature that provides a task status report on CTRL-C/SIGINT and make it work for qthreads tasking. The latter feature is very nearly as good for finding deadlocks, because the programmer can often arrange to know when a program has deadlocked or livelocked and manually interrupt it at that point. But it doesn't have the scalability and correctness problems of trying to do autonomous lack-of-progress detection.
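To make the "task status report on interrupt" idea concrete, here is a minimal sketch in Python rather than Chapel. It is only an analogy and has nothing to do with the Chapel runtime's implementation: Python threads stand in for tasks, and the names `task_report` and `install_sigint_report` are hypothetical. The program does no lack-of-progress detection of its own. It just dumps where each task is when the user decides it looks stuck and hits Ctrl-C.

```python
import signal
import sys
import threading
import traceback


def task_report() -> str:
    """Return a text report with the current stack of every live thread.

    Threads stand in for tasks here; this is an illustration only.
    """
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    # sys._current_frames() maps thread id -> that thread's topmost frame.
    for ident, frame in sys._current_frames().items():
        lines.append(f"task {names.get(ident, '?')} (id {ident}):")
        lines.extend(s.rstrip() for s in traceback.format_stack(frame))
    return "\n".join(lines)


def install_sigint_report() -> None:
    """On Ctrl-C/SIGINT, print the report to stderr and then exit.

    Must be called from the main thread, since Python only allows
    signal handlers to be installed there.
    """
    def handler(signum, frame):
        print(task_report(), file=sys.stderr)
        sys.exit(1)

    signal.signal(signal.SIGINT, handler)
```

Calling `install_sigint_report()` once at startup is enough. A hung program then prints one stack per thread on Ctrl-C, which usually shows which synchronization call each one is blocked in.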
Though I like these features when they work, I don't object to retiring the current implementation as being too weak to be useful. I also like Greg's proposal to preserve/improve the Ctrl-C/SIGINT behavior.
I updated the title and description to just talk about removing `--blockreport`, and I have been convinced that `--taskreport` is worthwhile.
https://github.com/chapel-lang/chapel/pull/10749 notested another deadlock test
`--blockreport` is an "experimental capability for tracking the status of tasks, primarily designed for use in a single-locale execution". The docs might have a little more info, but basically blockreport attempts to tell you which tasks are blocked on sync vars (and there's some really broken auto-detect deadlock support).

These "capabilities" are only supported under fifo (the very limited support for qthreads was removed in https://github.com/chapel-lang/chapel/pull/10073), and even under fifo the functionality is limited to single-locale execution. Even for our very basic test cases it's not reliable all the time, which has caused us to disable some tests: https://github.com/chapel-lang/chapel/pull/7255, https://github.com/chapel-lang/chapel/pull/7601
I believe we should deprecate support for blockreport for 1.18, and remove any implementation support for it shortly after.
I understand the desire for `--blockreport`, and something like it has been requested in https://github.com/chapel-lang/chapel/issues/9721, but I think our implementation is practically useless in its current state (and it predates the introduction of atomics, which are used much more frequently than syncs).

I'll note that I don't think this has been a huge maintenance burden for us. I'd really just like to get these flags out of any docs and removed from the `--help` output of all programs, and as we have time we can remove the implementation support in fifo.