Open nephatrine opened 2 years ago
I suspect that this is not an actual bug but rather a consequence of a pthreads optimization; see https://stackoverflow.com/a/68514238
I had also come across a similar error when running my application under valgrind, but after some research I came to the conclusion that it's a harmless "leak". You can probably circumvent it by having your process sleep for a little while before termination. If you believe that your issue is unrelated to this, feel free to re-open this issue.
It's not just a consequence of optimization; the lock is literally right before the destroy call in the code. No amount of sleeping will change that.
/* Because we allocate pool->threads after initializing the
mutex and condition variable, we're sure they're
initialized. Let's lock the mutex just in case. */
pthread_mutex_lock(&(pool->lock));
pthread_mutex_destroy(&(pool->lock));
https://linux.die.net/man/3/pthread_mutex_destroy
https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutex_destroy.html
Attempting to destroy a locked mutex results in undefined behavior.
I'd imagine this is actually "harmless" in practice, but why is the mutex being locked right before being destroyed in the first place? If we need that lock there, then inserting an unlock immediately afterwards would suffice to comply with the POSIX standard. The only concern I see is if something else could lock it between the unlock and the destroy, but if you've got other threads still trying to use the mutex at that point then you've got bigger issues.
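For illustration, here is what I have in mind (a sketch against the snippet quoted above; only the unlock line and its comment are new):

/* Because we allocate pool->threads after initializing the
   mutex and condition variable, we're sure they're
   initialized. Let's lock the mutex just in case. */
pthread_mutex_lock(&(pool->lock));
/* NEW: unlock before destroying -- POSIX leaves destroying a
   locked mutex undefined. If any thread could still lock it
   between these two calls, that's a separate lifetime bug. */
pthread_mutex_unlock(&(pool->lock));
pthread_mutex_destroy(&(pool->lock));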
I'll take a better look at this once I get home; I may have mixed up your issue with something else I was having. My bad for just assuming.
> It's not just a consequence of optimization; the lock is literally right before the destroy call in the code. No amount of sleeping will change that.
In the meantime, can you confirm that changing the ordering fixes the issue? If so, could you please open a PR fixing it?
Just to give a bit of context: the issue arises from how this third-party threadpool lib manages its cleanup. We can either look for an alternative (C99-compliant) threadpool implementation, write our own, or fix this one (and send a fix PR to the original repo out of courtesy).
Describe the bug
When I call discord_cleanup in a build using TSAN, I get diagnostic messages stating a locked mutex is being destroyed.
Looking at threadpool.c and the threadpool_free function, it seems you are intentionally locking the mutex just prior to destroying it. I don't think this impacts the program's functionality in any way, but it is undefined behaviour according to the spec.
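For reference, the diagnostic is reproducible with a minimal standalone program, nothing concord-specific (built here with gcc -fsanitize=thread -g repro.c):

#include <pthread.h>

int main(void)
{
    pthread_mutex_t lock;

    pthread_mutex_init(&lock, NULL);
    pthread_mutex_lock(&lock);
    /* Destroying a still-locked mutex is undefined behaviour;
       TSAN should report "destroy of a locked mutex" here.
       Adding pthread_mutex_unlock(&lock) before the destroy
       silences the report. */
    pthread_mutex_destroy(&lock);
    return 0;
}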
Expected behavior
If the lock is just to make sure anything else that could have been using the mutex has finished with it, then adding an unlock immediately after the lock and prior to the destroy should be enough to comply with the spec. If there is a concern that something else could acquire a new lock on the mutex at the point in the code where you're destroying it, then I'd imagine that is a completely separate concurrency issue that would need to be resolved elsewhere.
Version
commit 1da91a6641f668042fecd4a318923087bdb87739 (HEAD, tag: v2.1.0)
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.1 LTS
Release:        22.04
Codename:       jammy
Additional context
I am not attempting to debug or QA concord's threading implementation. I am merely debugging my own threaded code in my library that uses concord and was very confused upon seeing these diagnostics. FYI, the similar orca library has the same "issue". Given that it does not seem to have any negative impact beyond the noise in TSAN when closing my application, it's understandable if you elect to just leave it as is.