Closed · diego-coba closed this 2 months ago
cc @muellerzr @SunMarc
Why are we doing everything under `with device()`? Does it work if you remove this?
Thanks for looking at my issue.
Q: Why? A: When running prediction with the large variant of the model, PyTorch wasn't using the GPU, so I had to move it manually with `.to('cuda')`. To avoid moving everything by hand (tokenizer, dataset, model), I started using the `with device` syntax. Now I'm trying to train with PEFT and LoRA, and since my GPU has only 4 GB of VRAM I used the base variant this time, keeping the manual device specification, but the error shown above happens.
Q: Does it work if I remove it? A: It actually works. Even when I set the device to CPU, PyTorch somehow ignores it and, with the base variant of the model, automatically uses the GPU, as seen in `nvidia-smi` (about 3.8 GB of VRAM while the script runs).
So I don't know why PyTorch sometimes uses the GPU automatically and sometimes doesn't, but for some reason the error happens when I try to force GPU use with PEFT LoRA.
For now I'm relying on automatic device detection, but I still think something isn't working properly somewhere.
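To make the two approaches being compared concrete, here is a minimal sketch (variable names are illustrative, not from the original script): moving tensors manually with `.to(device)` versus making the device the default for everything created inside a `torch.device` context block.

```python
import torch

# Pick the GPU when available; fall back to CPU otherwise.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Approach 1: create on the default device, then move manually.
x = torch.ones(2, 2).to(device)

# Approach 2: the context manager makes `device` the default,
# so new tensors (and module parameters) land there directly.
with torch.device(device):
    y = torch.ones(2, 2)

print(x.device, y.device)  # both on the selected device
```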
Thanks again @muellerzr
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
System Info
- Transformers 4.41.2
- PyTorch 2.3.1+cu121
- Python 3.12.3
- Ubuntu 24.04
GPU: NVIDIA GeForce GTX 1650
Who can help?
No response
Information

Tasks

An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)

Reproduction
```
%pip install --quiet torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
%pip install --quiet -U datasets
%pip install --quiet torchdata
%pip install --quiet setuptools
%pip install --quiet transformers
%pip install --quiet evaluate
%pip install --quiet rouge_score
%pip install --quiet loralib
%pip install --quiet peft
%pip install --quiet ipywidgets
```
The code shown above throws `Expected a 'cuda' device type for generator but found 'cpu'`.
Expected behavior
It should not throw the error, since the entire script runs under `with torch.device(device):` with `device='cuda'`.
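For what it's worth, this error message typically comes from a `DataLoader` whose shuffling generator is created on the CPU while the default device is `'cuda'`. A hedged sketch of one workaround (names here are illustrative, not from the report): build the `DataLoader` outside the device context so its generator stays on CPU, and keep only model construction inside it, moving batches explicitly.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

device = "cuda" if torch.cuda.is_available() else "cpu"

# DataLoader created OUTSIDE the device context: its internal
# shuffling generator stays on the CPU, where randperm expects it.
dataset = TensorDataset(torch.arange(8, dtype=torch.float32).unsqueeze(1))
loader = DataLoader(dataset, batch_size=2, shuffle=True)

# Only the model is built under the context, so its parameters
# land on `device` without a manual .to() call.
with torch.device(device):
    model = torch.nn.Linear(1, 1)

for (batch,) in loader:
    out = model(batch.to(device))  # move each batch explicitly
print(out.shape)  # torch.Size([2, 1])
```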