Open 3rdAT opened 3 days ago
System Info
transformers version: 4.40.2

Who can help?
@ArthurZucker @gante
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)

Reproduction
When I run inference on two GPUs with the following command,
CUDA_VISIBLE_DEVICES=0,1 nohup python ./inference.py
the model generates answers properly.
However, when I use more than two GPUs with the following command,
CUDA_VISIBLE_DEVICES=0,1,3,4 nohup python ./inference.py
the model starts generating gibberish. On closer inspection, the logits it outputs are all NaN values.
Note: I load the model with device_map="auto".
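For reference, a minimal sketch of the kind of NaN check used during the introspection step, written over a flat Python list of logit values (the helper name `all_nan` is hypothetical; in the actual script this would typically be `torch.isnan(outputs.logits).any()` on the output tensor):

```python
import math

def all_nan(logits):
    """Return True if every value in a flat list of logits is NaN."""
    return all(math.isnan(x) for x in logits)

# A healthy logits row versus the failure mode described above.
healthy = [0.1, -2.3, 5.0]
broken = [float("nan")] * 3

print(all_nan(healthy))  # False
print(all_nan(broken))   # True
```

When `all_nan` returns True for every generation step, the sampled token ids are effectively arbitrary, which matches the gibberish output seen with four GPUs.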
Expected behavior
I expect the model to generate coherent output on four GPUs, just as it does on two.