Hi all,
I am running on an NVIDIA GTX 1080 Ti (11 GB video memory) on Windows 11.
I get the following error when running inference_example on the llama-7b-hf model:
OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 11.00GiB total capacity; 10.29 GiB already allocated; 0 bytes free; 10.29 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
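The message suggests setting max_split_size_mb. For what it's worth, this is how I understand it would be applied (my assumption; the value 64 is just an example, not a recommendation):

```python
import os

# My understanding: PYTORCH_CUDA_ALLOC_CONF must be read before the CUDA
# caching allocator is initialized, so set it before importing torch.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:64"

import torch  # the allocator picks up the setting on first CUDA use
```

That said, reserved (10.29 GiB) equals allocated (10.29 GiB) in my error, so I suspect the model simply does not fit rather than fragmentation being the problem.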
May I know how much memory is required to run this model locally? Also, is there any workaround?
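My rough math (an assumption on my part, please correct me): 7B parameters at 2 bytes each in fp16 is about 14 GB for the weights alone, before activations and the KV cache, so an 11 GB card would not fit the model even in half precision. Would something like the following be a viable workaround? This is just a sketch assuming the model loads through the standard Hugging Face transformers API with accelerate installed; I have not confirmed that inference_example works this way:

```python
# A minimal sketch (my assumption, not necessarily what inference_example
# does): load weights in fp16 and let accelerate offload layers that do
# not fit on the 11 GB GPU into CPU RAM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "llama-7b-hf"  # local path to the converted checkpoint (assumed)

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # half precision: ~14 GB of weights vs. ~28 GB in fp32
    device_map="auto",          # requires accelerate; spills overflow onto the CPU
)

prompt = "Hello, my name is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Alternatively, I understand load_in_8bit=True via bitsandbytes would cut the weights to roughly 7 GB, though I am not sure bitsandbytes runs cleanly on Windows.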
Thanks