huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

huggingface transformer running on CPU behind celery/redis doesn't work (but works by itself) #7516

Closed HodorTheCoder closed 4 years ago

HodorTheCoder commented 4 years ago

Hello,

I am creating this issue for posterity: it took me a day to figure out, and if anybody else hits the same problem, hopefully this helps.

I am running a Bert2Bert EncoderDecoderModel inside a Docker container, behind a Celery worker that receives jobs through Redis. This is a production test environment on a machine without a GPU, so yes, it's slow, but that's not a deal breaker.

Anyway, everything works great in testing when the model runs by itself. However, when I put it behind Celery inside a task, it would load the model, get as far as generating text, and just hang. I couldn't figure out what the problem was until I found this thread:

https://github.com/celery/celery/issues/4113

The issue is how the CPU version of the model interacts with forking: Celery's default worker pool forks child processes, which breaks unless you add the following flag when starting Celery:

```
--pool=solo
```

Setting this fixes the concurrency issues with forking and everything works. So, it's a configuration issue.
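For anyone searching later, here is a minimal sketch of what the working setup looks like. The app name, broker URL, task name, and checkpoint are all illustrative, not taken from my actual code:

```python
# tasks.py -- minimal sketch; broker URL, checkpoint, and names are illustrative
from celery import Celery
from transformers import BertTokenizer, EncoderDecoderModel

app = Celery("tasks", broker="redis://localhost:6379/0")

# Load once at worker startup, on CPU.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = EncoderDecoderModel.from_pretrained(
    "patrickvonplaten/bert2bert-cnn_dailymail-fp16"  # any Bert2Bert checkpoint works here
)

@app.task
def generate_summary(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    # Under the default prefork pool this call hangs on CPU;
    # with --pool=solo the worker runs in the main process and it completes.
    output_ids = model.generate(inputs.input_ids)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Then start the worker with the solo pool:

```
celery -A tasks worker --pool=solo --loglevel=info
```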

Go forth and prosper.

HodorTheCoder commented 4 years ago

The answer is included in the main post. Thanks.

floschne commented 2 years ago

Another way that "fixed" the problem for me was limiting PyTorch to a single thread by calling `torch.set_num_threads(1)` BEFORE loading the model in the worker.
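In case it helps, the sketch below shows where the call has to go; the checkpoint is just an example:

```python
import torch

# Cap PyTorch at one intra-op thread; per the comment above, this must
# happen BEFORE the model is loaded in the worker process.
torch.set_num_threads(1)

from transformers import EncoderDecoderModel

model = EncoderDecoderModel.from_pretrained(
    "patrickvonplaten/bert2bert-cnn_dailymail-fp16"  # example checkpoint
)
```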

mumtazcem commented 2 years ago

Thanks, man, it took me a day to find your comment as well. 😆