Closed. eclickECNU closed this issue 11 months ago.
Hey, thanks for the report. I assume you're referring to the reserved memory, which is basically the same for both experiments. I'm not really sure how to explain this. Unfortunately, I could not run your script currently (GCP is out of GPUs in my region). I ran a similar script using a vision transformer model and there, the reserved memory was smaller with LoRA.
In general, I find it strange that the reserved memory is so high to begin with compared to allocated memory. Do you know if that is expected?
I'm not entirely sure if this is normal, but my program has exhibited this behavior from the beginning, and I'm quite confused as well. After applying LoRA, the allocated memory does decrease (from 9.2 GB to 6.6 GB), but the overall GPU memory remains almost unchanged (nearly 70 GB).
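The drop in allocated memory is what LoRA is expected to deliver: the frozen base weights still occupy memory, but gradients and optimizer states exist only for the small low-rank adapters. A rough back-of-the-envelope sketch, using hypothetical model sizes (not the actual model in this issue) and assuming fp32 Adam throughout:

```python
# Illustrative only: compares training memory for full fine-tuning vs.
# LoRA, counting weights for all parameters plus gradient and two Adam
# state copies for the *trainable* parameters only.

def training_bytes(n_params, n_trainable, bytes_per_param=4):
    """Weights for all params + grad and 2 Adam states for trainable ones."""
    weights = n_params * bytes_per_param
    grads = n_trainable * bytes_per_param
    adam_states = 2 * n_trainable * bytes_per_param
    return weights + grads + adam_states

d = 4096                       # hypothetical hidden size
n_layers = 32                  # hypothetical layer count
base = n_layers * d * d        # one d x d weight matrix per layer
r = 8                          # LoRA rank
lora = n_layers * 2 * r * d    # A (r x d) and B (d x r) per layer

full_ft = training_bytes(base, base)          # everything trainable
lora_ft = training_bytes(base + lora, lora)   # only adapters trainable

print(f"full fine-tuning : {full_ft / 2**30:.2f} GiB")   # 8.00 GiB
print(f"LoRA fine-tuning : {lora_ft / 2**30:.2f} GiB")   # 2.03 GiB
```

The ratio, not the absolute numbers, is the point: allocated memory should drop roughly by the gradient and optimizer share of the full model, which matches the 9.2 GB to 6.6 GB observation.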
Do you have any special settings on the machine that could influence how much memory PyTorch reserves, e.g. custom settings for PYTORCH_CUDA_ALLOC_CONF? Otherwise, I'm at my wit's end. @younesbelkada Have you ever seen this behavior with huge amounts of reserved memory?
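For reference, two documented PYTORCH_CUDA_ALLOC_CONF settings that are worth trying when reserved memory grows far beyond allocated memory. The script name below is hypothetical; the variable must be set before the process that initializes CUDA starts, since the allocator reads it once:

```shell
# Cap the size of cached blocks that may be split; this can reduce
# fragmentation-driven growth of reserved memory.
export PYTORCH_CUDA_ALLOC_CONF="max_split_size_mb:128"
# python train.py   # hypothetical training script, launched with the variable set

# On recent PyTorch versions, expandable segments let the allocator
# grow and shrink its reservations instead of pinning a high-water mark.
export PYTORCH_CUDA_ALLOC_CONF="expandable_segments:True"
# python train.py
```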
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
System Info
Who can help?
@BenjaminBossan
Information
Tasks
examples folder
Reproduction
I have observed that when using LoRA with a VisionEncoderDecoderModel, there is no significant change in GPU memory. The code I am running is as follows:
Expected behavior
In my experiments:
After checking the GPU memory using nvidia-smi, I noticed that there is almost no change in the overall GPU memory. How can I address this issue?
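One common reason nvidia-smi barely moves is PyTorch's caching CUDA allocator: freed blocks are kept in a cache for reuse rather than returned to the driver, so the device-level usage that nvidia-smi reports (roughly the reserved memory) stays at its high-water mark even after allocated memory drops. A toy simulation of that behavior (not the real allocator, which also handles block splitting and segments):

```python
# Toy model of a caching allocator: "allocated" tracks live tensor
# bytes, "reserved" tracks bytes ever requested from the device.
# Freeing returns memory to the cache, not to the device, so
# reserved never decreases here.

class CachingAllocator:
    def __init__(self):
        self.allocated = 0   # bytes currently held by live tensors
        self.reserved = 0    # bytes requested from the device so far

    def malloc(self, nbytes):
        self.allocated += nbytes
        # grow the reservation only when the cache can't cover the request
        self.reserved = max(self.reserved, self.allocated)

    def free(self, nbytes):
        # the block is cached for reuse, not returned to the device
        self.allocated -= nbytes

alloc = CachingAllocator()
alloc.malloc(9 * 2**30)                            # heavy full-model phase
alloc.free(3 * 2**30)                              # lighter LoRA phase
print(alloc.allocated // 2**30, "GiB allocated")   # prints: 6 GiB allocated
print(alloc.reserved // 2**30, "GiB reserved")     # prints: 9 GiB reserved
```

Within a single process, torch.cuda.empty_cache() releases the cached blocks back to the driver, which is the usual way to make nvidia-smi reflect the lower allocated figure; across two separate runs, each process reports its own high-water mark.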