ypeleg / llama

User-friendly LLaMA: Train or Run the model using PyTorch. Nothing else.

OutOfMemoryError: CUDA out of memory #11

Open · Prakash19921206 opened this issue 1 year ago

Prakash19921206 commented 1 year ago

Hi all, I am running an NVIDIA GTX 1080 Ti (11 GB video memory) on Windows 11, and I get the following error when running inference_example on the llama-7b-hf model:

OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 11.00 GiB total capacity; 10.29 GiB already allocated; 0 bytes free; 10.29 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
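
For what it's worth, the max_split_size_mb hint in that message only helps when reserved memory greatly exceeds allocated memory (fragmentation); here 10.29 GiB is genuinely allocated out of 11 GiB, so the weights themselves nearly fill the card. A minimal sketch of setting it anyway, assuming it runs before the first CUDA allocation:

```python
import os

# The caching allocator reads PYTORCH_CUDA_ALLOC_CONF when the first CUDA
# tensor is created, so this must run before any torch.cuda work
# (e.g. at the very top of inference_example).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # safe to import after the variable is set
```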

May I know how much memory is required to run this model locally? Also, is there any workaround?

Thanks
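
For a rough sense of the memory required: LLaMA-7B has about 6.7 billion parameters, so the weights alone take roughly 27 GB in fp32, 13.5 GB in fp16, and 7 GB in int8, before activations and the KV cache. On an 11 GB GTX 1080 Ti, only the 8-bit variant fits entirely on the GPU. Below is a minimal sketch of one common workaround, using the Hugging Face transformers API rather than this repo's loader; the model path, and the availability of accelerate and bitsandbytes on this machine, are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM

MODEL_PATH = "llama-7b-hf"  # assumption: local directory with the HF-format weights

# Option 1: half precision. ~13.5 GB of weights still exceeds 11 GB, but
# device_map="auto" (via accelerate) offloads the overflow to CPU RAM.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Option 2: 8-bit quantization via bitsandbytes (~7 GB of weights), which
# fits entirely within 11 GB, hardware support permitting.
# model = AutoModelForCausalLM.from_pretrained(
#     MODEL_PATH,
#     load_in_8bit=True,
#     device_map="auto",
# )
```

Note that older Pascal cards like the 1080 Ti have limited fp16 throughput and were not the primary target of bitsandbytes' int8 kernels, so treat both options as starting points to verify locally rather than guaranteed fixes.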