Terrencezzj opened 5 days ago
Can you take a look at this @SunMarc or @MekkCyber ?
v4.43.4 doesn't have the issue
Hi @LysandreJik, this potential feature regression in Transformers has caused issues in our library (NVIDIA TensorRT Model Optimizer) when our users want to run quantization on multiple GPUs. Currently, our users need to revert back to Transformers v4.43.4. We would appreciate it if you could help prioritize this. Thanks!
Hi @hchings @Terrencezzj, thanks for specifying that v4.43.4 does work. I am looking into the problem.
Hi, I found that there was no accelerate config. You can create an accelerate config file with a multi-GPU setup and try running the snippet below, assuming you have already loaded the model and tokenizer.
Refer here to create a config.yaml file to support multi-GPU inference: https://huggingface.co/docs/accelerate/v1.1.0/en/package_reference/accelerator#accelerate.Accelerator https://huggingface.co/docs/accelerate/v1.1.0/en/package_reference/utilities#accelerate.DistributedType
```python
from accelerate import Accelerator

# Assumes `model` and `tokenizer` have already been loaded.
accelerator = Accelerator()
model = accelerator.prepare(model)
input_ids = tokenizer.encode("Any Context", return_tensors="pt").to(accelerator.device)
gen_tokens = model.generate(input_ids)
```
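As a sanity check (this is a generic sketch, not part of the original suggestion), you can print what the Accelerator actually detected to confirm the multi-GPU config was picked up:

```python
from accelerate import Accelerator

# Report the distributed setup Accelerate detected from the config/environment.
accelerator = Accelerator()
print("distributed type:", accelerator.distributed_type)
print("num processes:", accelerator.num_processes)
print("device:", accelerator.device)
```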
Hi, I got the same error with Accelerator. Please note that my script is not doing generation; it calls `model(input_ids)` directly (a plain forward pass).
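Roughly, a minimal sketch of what I mean (the checkpoint name and loading details here are only illustrative, not my exact script), assuming the model is spread over the visible GPUs with `device_map="auto"`:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative checkpoint; the real script uses a different model.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

input_ids = tokenizer.encode("Any Context", return_tensors="pt").to(model.device)

# The failure happens on a plain forward pass, not on generate().
with torch.no_grad():
    outputs = model(input_ids)
print(outputs.logits.shape)
```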
System Info
transformers version: 4.47.0.dev0

Who can help?
No response

Information

Tasks
examples folder (such as GLUE/SQuAD, ...)

Reproduction

Expected behavior
If only one GPU is set visible, there is no error.
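For reference, the single-GPU case mentioned above just restricts CUDA visibility before anything initializes CUDA; a generic sketch (not the exact command from the report):

```python
import os

# Must be set before torch / transformers initialize CUDA.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch
print(torch.cuda.device_count())  # expected: 1
```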