Closed Feoramund closed 3 days ago
I think the MacOS Intel test is stalling on the thread pool termination in the demo, if I had to take a guess.
I think the MacOS Intel test is stalling on the thread pool termination in the demo, if I had to take a guess.
hmm, now we know why it was "unreliable" I guess? Maybe just on Intel? It works perfectly on my/ci ARM
Although I do wonder if this is a 100% thing or a failure that we sometimes have in ci
This idea might be starting to creep beyond the scope of this PR, but I think thread
either needs a new API for threads to call to check a cancelation point to satisfy pthread (a no-op on other platforms), or thread termination should be defined as forceful termination and not cancellation, possibly with a split in the API, one proc for terminate/kill (immediate) and one for cancel (negotiated).
I imagine forceful termination for relevant platforms could be handled as a thread-specific signal handler in the thread preamble, with thread.terminate
sending a signal to that thread, similar to how it's done in the test runner except not process-wide.
This idea might be starting to creep beyond the scope of this PR, but I think
thread
either needs a new API for threads to call to check a cancelation point to satisfy pthread (a no-op on other platforms), or thread termination should be defined as forceful termination and not cancellation, possibly with a split in the API, one proc for terminate/kill (immediate) and one for cancel (negotiated).I imagine forceful termination for relevant platforms could be handled as a thread-specific signal handler in the thread preamble, with
thread.terminate
sending a signal to that thread, similar to how it's done in the test runner except not process-wide.
Yeah this is starting to get into things the thread rewrite will probably cover. I just hoped this was a quick fix we could add for now so I/macos users can run tests without pulling their hair out after a panic.
I just hoped this was a quick fix we could add for now so I/macos users can run tests without pulling their hair out after a panic.
I hoped it was going to be quick too, but it looks like we stumbled into the mire on this one.
Hmm there are some other when statements regarding cancelling further up in thread_unix, these probably need to be removed too, maybe it will work then?
Otherwise, yeah, out of (quick) ideas, thanks for the quick response and PR though, at least it works locally on my end.
Hmm there are some other when statements regarding cancelling further up in thread_unix, these probably need to be removed too, maybe it will work then?
You're right. Well, it's worth a shot. I've pushed again, so we'll see in a few minutes.
Hmm there are some other when statements regarding cancelling further up in thread_unix, these probably need to be removed too, maybe it will work then?
You're right. Well, it's worth a shot. I've pushed again, so we'll see in a few minutes.
Looks like it worked and is ready to go now! iMO a great change pending the thread rewrite.
I made sure that even with the signal handler catching
SIGTRAP
that I was able to catch a call tointrinsics.debug_trap()
withlldb
and continue on merrily without the runner taking control.This PR also lets Darwin call
pthread_cancel
again, with an advisory comment based on my experiences working with FreeBSD. It is possible that the previous unreliable behavior was based off of a lack of cancelation points being used.Fixes #3830
EDIT: One moment while I look up what
SIGTRAP
is on Windows...