System Info
Information
Tasks
Reproduction
I'm trying to run Docker on 2 A16 GPUs with model_id "google/gemma-2b", but after the model download step I run into an AssertionError like the following.
Expected behavior
When I run with only 1 GPU it initializes just fine. The issue only happens when I try to use multiple GPUs.
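For context, a command along these lines reproduces the setup described above. This is a sketch only: the report does not state the exact image or flags, so the text-generation-inference container and its `--model-id`/`--num-shard` options are assumptions on my part, and paths/ports are placeholders.

```shell
# Hypothetical reproduction command (image and flags assumed, not confirmed by the report):
# request both A16 GPUs and shard the model across them
docker run --gpus all --shm-size 1g -p 8080:80 \
    -v $PWD/data:/data \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id google/gemma-2b \
    --num-shard 2
```

With `--num-shard 2` the server shards the model across both GPUs; dropping that flag (single-GPU) matches the configuration that initializes fine.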