nalepae / pandarallel

A simple and efficient tool to parallelize Pandas operations on all available CPUs
https://nalepae.github.io/pandarallel
BSD 3-Clause "New" or "Revised" License

RuntimeError: Cannot re-initialize CUDA in forked subprocess. #272

Open riyajatar37003 opened 3 weeks ago

riyajatar37003 commented 3 weeks ago

I am using a transformer model to generate embeddings inside a function, and that function is applied to each row of a DataFrame using parallel_apply, which throws the error below:

RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/snow.atg_arch_only.home/users/ariyaz/AI_Search/ais_ml_embedding_eval/ais_ml/tevatron/hnmine/negative-mine-v2.py", line 93, in
    dataframe["output"] = dataframe.parallel_apply(lambda row: retriever(row,index,model,
  File "/tmp/.local/lib/python3.10/site-packages/pandarallel/core.py", line 333, in closure
    results_promise.get()
  File "/opt/conda/lib/python3.10/multiprocessing/pool.py", line 774, in get
    raise self._value
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method