Farzad-R / Finetune-LLAVA-NEXT

This repository contains code for fine-tuning the LLAVA-1.6-7b-mistral multimodal LLM.

OOM error #2

Closed (bialykostek closed this 1 week ago)

bialykostek commented 1 week ago

Hi! Thanks for the repo, but unfortunately I get a CUDA out-of-memory error when I try to run fine-tuning.

```
OutOfMemoryError: CUDA out of memory. Tried to allocate 224.00 MiB. GPU 0 has a total capacity of 14.75 GiB of which 7.06 MiB is free. Process 130036 has 14.74 GiB memory in use. Of the allocated memory 14.04 GiB is allocated by PyTorch, and 580.33 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
```
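As the error text itself suggests, one low-effort thing to try first is the allocator setting it mentions. A minimal sketch, assuming the variable is set before anything touches CUDA (i.e. before torch is imported):

```python
import os

# Ask PyTorch's CUDA caching allocator to use expandable segments, which can
# reduce fragmentation-related OOMs. This must be in the environment before
# the first CUDA allocation, so set it before importing torch.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import torch  # noqa: E402
```

Note that this only helps when the failure is due to fragmentation; it won't make a model fit that simply needs more VRAM.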

I've tested it on Colab (L4) as well as on my local machine (RTX 3090), with the same result. Can you share which device (or devices) you used for fine-tuning and how much VRAM it required? Maybe some library is causing a memory leak.

Farzad-R commented 1 week ago

Hi,

Your GPUs do not have enough memory for this task. I'm not sure if you have watched the video in which I explain this project in detail; there I cover the minimum GPU you need for fine-tuning LLAVA. Please check 06:21 in this video:

https://www.youtube.com/watch?v=0pd1ZDT--mU&t=381s
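For readers without access to a larger GPU, a common fallback (not what this repo does out of the box; the model id and settings below are assumptions) is to load the model in 4-bit and enable gradient checkpointing, e.g. with transformers and bitsandbytes:

```python
import torch
from transformers import BitsAndBytesConfig, LlavaNextForConditionalGeneration

# Hypothetical sketch: QLoRA-style 4-bit loading to shrink the memory
# footprint. The HF checkpoint id below is an assumption and may differ
# from the exact model used in this repo.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = LlavaNextForConditionalGeneration.from_pretrained(
    "llava-hf/llava-v1.6-mistral-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
model.gradient_checkpointing_enable()  # trade extra compute for activation memory
```

Even then, a 7B multimodal model plus optimizer state is a tight fit on a 24 GB card, so a small batch size and LoRA-style adapters (training only a small set of added weights) are usually needed as well.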

bialykostek commented 1 week ago

Thank you for the good explanation! I didn't know about the video; you should link it in the README ;)

Farzad-R commented 1 week ago

You're welcome! I thought I had. Thanks for the notice; I just updated the README and added the video. I'll close this issue then, but if something else comes up, please feel free to either continue the conversation here or open a new one.