Closed · A-Debord closed this issue 6 years ago
Try moving condition.notify_one(); under the lock.
@wilx Moving notify_one() inside the locked scope shouldn't fix anything.
It's entirely valid to notify without holding the lock, and doing so should probably be assumed to be the most efficient option. Notifying while holding the lock may put the notified thread into contention for that lock. That contention is usually resolved very quickly, but it risks waking a thread, putting it immediately back to sleep, and waking it again moments later.
This comment refers to an old version of the code; there are a few improved versions later in this thread.
It's hard to say what is really causing issue here without seeing a complete verifiable program. Fragments don't tell the full story.
Also note that std::cout isn't thread-safe, so you should put a lock around it. I'm not saying that is the problem, but you might be losing logged messages.
It's none too pretty to introduce a std::mutex over std::cout, because all the locking and unlocking makes the threads contend more (and hence run more sequentially), and the implicit memory barriers can hide issues that are present in the non-debug code.
While I have personally said I think the wakeups cannot be missed (https://github.com/progschj/ThreadPool/issues/34#issuecomment-266973470), I am not sure any more. In fact, I did do this change in my modifed version of the thread pool (https://github.com/log4cplus/ThreadPool/commit/f690bf8dd9409aadcc3f35e6d5b50ef41eb28143) because I had similar issues, IIRC.
@wilx All the documentation and all the advice is that you do not need to be (and should probably avoid) holding the lock to notify, but you must be holding the lock to update the state the condition depends on (i.e. what the predicate reads).
http://en.cppreference.com/w/cpp/thread/condition_variable
I don't know what this problem is, but barring an appalling bug in the library, that really should not be the cause.
I've let my program run all day. It did not freeze. However, yesterday, with the same ThreadPool code, the deadlock occurred multiple times. The only difference I see is that yesterday the program was recompiled between each launch (because I was working on it), and today I put it in an infinite loop.
So, to me, it has something to do with the compilation procedure (I'm using the O3 flag, btw). I will investigate further in that direction.
Here is the exact code, maybe you'll see something:
threadpool.h

#include <condition_variable>
#include <functional>
#include <future>
#include <iostream>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class ThreadPool {
public:
    ThreadPool(size_t);
    template<class F, class... Args>
    auto queue(F&& f, Args&&... args)
        -> std::future<typename std::result_of<F(Args...)>::type>;
    ~ThreadPool();
private:
    void worker(int i);
    bool _stop;
    std::vector<std::thread> _workers;
    std::queue<std::function<void()>> _tasks;
    std::mutex _tasks_mutex;
    std::condition_variable _messenger;
};

template<class F, class... Args>
auto ThreadPool::queue(F&& f, Args&&... args)
    -> std::future<typename std::result_of<F(Args...)>::type>
{
    using return_type = typename std::result_of<F(Args...)>::type;
    auto task = std::make_shared<std::packaged_task<return_type()>>(
        std::bind(std::forward<F>(f), std::forward<Args>(args)...)
    );
    std::future<return_type> res = task->get_future();
    std::cout << "enqueue waiting for lock" << std::endl;
    {
        std::unique_lock<std::mutex> lock(_tasks_mutex);
        std::cout << "enqueue locked" << std::endl;
        _tasks.emplace([task](){ (*task)(); });
        std::cout << "enqueue will unlock" << std::endl;
    }
    std::cout << "enqueue will notify" << std::endl;
    _messenger.notify_one();
    std::cout << "enqueue notified" << std::endl;
    return res;
}
threadpool.cpp

#include "threadpool.h"

ThreadPool::ThreadPool(size_t threads) : _stop(false)
{
    for(size_t i = 0; i < threads; ++i) {
        _workers.push_back(std::thread(&ThreadPool::worker, this, i));
    }
}

ThreadPool::~ThreadPool()
{
    {
        std::unique_lock<std::mutex> lock(_tasks_mutex);
        _stop = true;
    }
    _messenger.notify_all();
    for(size_t i = 0; i < _workers.size(); ++i)
        _workers[i].join();
}

void ThreadPool::worker(int i)
{
    while(true)
    {
        std::function<void()> task;
        std::cout << "worker " << i << " waiting for lock" << std::endl;
        {
            std::unique_lock<std::mutex> lock(this->_tasks_mutex);
            std::cout << "worker " << i << " waiting" << std::endl;
            this->_messenger.wait(lock,
                [this]{ std::cout << "message received "
                                  << (this->_stop || !this->_tasks.empty()) << std::endl;
                        return this->_stop || !this->_tasks.empty(); });
            std::cout << "worker " << i << " awoken" << std::endl;
            if(this->_stop && this->_tasks.empty())
                return;
            task = std::move(this->_tasks.front());
            this->_tasks.pop();
        }
        std::cout << "worker " << i << " released" << std::endl;
        task();
    }
}
I can see nothing wrong here.
std::unique_lock<> is entirely valid, but you might consider the even lighter-weight std::lock_guard<> where you aren't implicitly (or explicitly) releasing the lock early, such as in queue and ~ThreadPool.
@A-Debord Try to get call stacks of all threads if you ever see this deadlock again.
I am having the same issue. Here is what I think is happening:
If you add tasks at a slower rate than the pool is consuming them, everything will be fine.
If at any point, you add a task while all of the workers are busy, then the .notify_one() call will not trigger any .wait()ing workers. You now have asymmetry, and there will be items left in the queue. Adding new tasks will allow the old to be processed, but you'll have a queue that keeps growing every time a .notify_one() call falls on deaf ears.
std::condition_variable does not enqueue notify*() calls.
Perhaps I'm missing something. Correct me if I'm wrong.
In reply to @malamanteau's comment above (Jan 6, 2018, 9:55 PM GMT+1):
This does not seem a plausible explanation. If the workers are busy and none is wait()ing, then you indeed miss a notification. But that should not be a problem: once any worker gets out of its task() call, it will lock the mutex again and re-check the condition, which will make it avoid sleeping on the CV and instead do another iteration with another task.
I have done some digging and this might be related, if you are using this on Linux/Glibc:
Rather interesting read.
So, assuming this might be related to problems mentioned here by OP and @malamanteau, my next question is what is your Glibc version, assuming you are running this on Linux? It seems some issues were fixed by version 2.25.
Also, I have asked a question about how atomic pthread_cond_wait(), on which the C++11 construct is based on Linux, actually is. It turns out there are corner cases that could affect us, I think: c - atomicity of pthread_cond_wait()'s unlock-and-wait? - Stack Overflow
So, I am inclined to suggest, again, moving all of the notifications under the held-mutex section.
@wilx Indeed, this is running on Linux/Glibc 2.24.
I'm still parsing this, but wow, this is very interesting!
*facepalm* My issue was elsewhere. Sorry, guys. I had a spurious lock outside of the thread pool.
I think this issue should be closed as it's (probably) solved.
Hello,
Sometimes, after a worker has begun waiting, the other threads cannot lock the mutex anymore. I don't get why... This occurs very rarely, roughly once every 30000 calls to enqueue(). It is, however, a serious issue for my use case.
To track the issue, I have added some prints in enqueue() and the worker:
Prints in enqueue :
Prints in the worker lambda :
When the deadlock happens I get :
enqueue waiting for lock
enqueue locked
enqueue will unlock
enqueue will notify
worker 5 awoken
worker 5 released
worker 3 waiting for lock
worker 3 waiting
enqueue notified
enqueue waiting for lock
enqueue locked
enqueue will unlock
enqueue will notify
worker 6 waiting for lock
worker 6 waiting
worker 6 awoken
worker 6 released
worker 4 waiting for lock
worker 4 waiting
worker 5 waiting for lock
worker 6 waiting for lock
Two things I don't get
I hope I don't give you too much of a headache. Cheers