asterisk / asterisk

The official Asterisk Project repository.
https://www.asterisk.org
Other
1.99k stars 929 forks source link

chan_console: Fix deadlock caused by unclean thread exit. #309

Closed InterLinked1 closed 8 months ago

InterLinked1 commented 8 months ago

To terminate a console channel, stop_stream causes pthread_cancel to make stream_monitor exit. However, commit 5b8fea93d106332bc0faa4b7fa8a6ea71e546cac added locking to this function which results in deadlock due to the stream_monitor thread being killed while it's holding the pvt lock.

To resolve this, a flag is now set and read to indicate abort, so the use of pthread_cancel and pthread_kill can be avoided altogether.

Resolves: #308

InterLinked1 commented 8 months ago

cherry-pick-to: 18 cherry-pick-to: 20 cherry-pick-to: 21

seanbright commented 8 months ago

Isn’t this what pthread_cleanup_push() is for? That would be cleaner IMO.

InterLinked1 commented 8 months ago

Isn’t this what pthread_cleanup_push() is for? That would be cleaner IMO.

Maybe cleaner than setting the cancellability, but it's a little messy and using pthread_cancel is never very "clean", in my opinion, so if it can be avoided that's probably more ideal.

I restored the pthread_kill so if the read is blocking for some reason, it should still wake up, and exit after it releases the mutex, and then the thread can join.

seanbright commented 8 months ago

using pthread_cancel is never very "clean", in my opinion

And why is that?

InterLinked1 commented 8 months ago

using pthread_cancel is never very "clean", in my opinion

And why is that?

Because the thread that's cancelled doesn't get to control how it exits. You can add a cleanup handler but now you'd probably need to keep track of whether it's holding a mutex, for example, and then release it only then.

This guy here has a good explanation that says it better than I can: https://stackoverflow.com/a/3438576

I subscribe to the theory that each thread should totally control its own resources, including its execution lifetime. I've always found that to be the best way to avoid deadlock.

pthread_cancel was fine before the commit that added locking, but now that there are resources fine, it's cleaner to not use pthread_cancel and have the thread control where it exits.

seanbright commented 8 months ago

Is Pa_ReadStream() a cancellation point? I don’t see how calling pthread_cancel() here could leave pvt locked.

InterLinked1 commented 8 months ago

Is Pa_ReadStream() a cancellation point? I don’t see how calling pthread_cancel() here could leave pvt locked.

Internally, probably (internally, I imagine it would call read somewhere), because it was exiting while holding the pvt lock. There's an example in the linked issue.

seanbright commented 8 months ago

Because the thread that's cancelled doesn't get to control how it exits.

Of course it does. pthread_testcancel() and pthread_cancel() are cooperative.

InterLinked1 commented 8 months ago

Because the thread that's cancelled doesn't get to control how it exits.

Of course it does. pthread_testcancel() and pthread_cancel() are cooperative.

Yes, but if PA_readstream internally calls pthread_testcancel, which I'm almost positive it does (perhaps via read or poll), then you're screwed because it will exit at the wrong cancellation point. That's why cancellability would need to be disabled while it's holding the mutex, to ensure it never exits in the middle.

I believe by default all threads are cancellable so without this there isn't a guarantee it won't be cancelled in the middle. I originally tested it this way before changing it to a flag. This should also work, if you strongly prefer this approach to setting a flag and not using pthread_cancel, I could change it back, they should both do the same thing.

seanbright commented 8 months ago

My preference would be to use cancellation but I am not writing the code. If you've confirmed this fixes the issue then it's fine.

github-actions[bot] commented 8 months ago

Successfully merged to branch master and cherry-picked to ["18","20","21"]