Closed: HodorTheCoder closed this 4 years ago
Answer is included in the main post. Thanks.
Another way that "fixed" the problem for me was setting the number of torch threads to 1: torch.set_num_threads(1) BEFORE loading the model in the worker.
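A minimal sketch of that workaround, assuming a Hugging Face EncoderDecoderModel loaded inside the Celery task module (the checkpoint path is a placeholder):

```python
import torch

# Cap torch's intra-op thread pool BEFORE the model is loaded; on some
# CPU-only setups the thread pool does not survive Celery's fork, and
# generate() hangs inside the forked worker.
torch.set_num_threads(1)

from transformers import EncoderDecoderModel

model = EncoderDecoderModel.from_pretrained("path/to/bert2bert-checkpoint")  # placeholder
```

Note that this caps intra-op parallelism, so CPU inference will be slower, but it avoids the hang.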
Thanks, man, it took me a day to find your comment as well. 😆
Hello,
I'm writing this up for posterity because it took me a day to figure out, and hopefully it helps anybody else who hits this issue.
I am running a Bert2Bert EncoderDecoderModel inside a Docker container, behind Celery workers that pull jobs from Redis. This is in a production test environment on a machine without a GPU, so yes, it's slow, but that's not a deal breaker.
Anyway: when tested on its own, everything works great. However, once I put the model inside a Celery task, it would load fine, reach the text-generation step, and just hang. I couldn't figure out what the problem was until I found this thread:
https://github.com/celery/celery/issues/4113
The issue is how the CPU version of the model interacts with Celery's forking: the default worker pool configuration breaks unless you pass the following flag when starting Celery:
--pool=solo
Setting this fixes the fork-related concurrency issue and everything works. So it's purely a configuration issue.
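For reference, a sketch of what that looks like if you'd rather set it in the Celery config instead of on the command line (module, app name, and broker URL are placeholders):

```python
# celery_app.py -- placeholder module and app names
from celery import Celery

app = Celery("inference", broker="redis://localhost:6379/0")

# Equivalent to passing --pool=solo on the command line: run each task in
# the worker's main process instead of a forked prefork pool, so the
# torch model never has to survive a fork().
app.conf.worker_pool = "solo"
```

The worker is then started as usual, e.g. `celery -A celery_app worker`; keeping the explicit `--pool=solo` flag instead achieves the same thing.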
Go forth and prosper.