LHRLAB / ChatKBQA

[ACL 2024] Official resources of "ChatKBQA: A Generate-then-Retrieve Framework for Knowledge Base Question Answering with Fine-tuned Large Language Models".
https://arxiv.org/abs/2310.08975
MIT License

torch.cuda.OutOfMemoryError: CUDA out of memory. #2


ganlinganlin commented 7 months ago

Hello, my friend. While training Llama-2-13B on an A30 GPU with 24GB of memory, I am hitting a GPU memory allocation error. Are there any feasible solutions or code modifications that would resolve this issue?

Error:

```
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 23.50 GiB total capacity; 23.16 GiB already allocated; 2.81 MiB free; 23.16 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```

Thanks!
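For reference, the allocator hint in the error text itself can be applied by setting `PYTORCH_CUDA_ALLOC_CONF` before launching training. This is only a sketch of that suggestion; the value `128` is an illustrative assumption, not a setting from the ChatKBQA repo, and it mitigates fragmentation rather than an absolute memory shortfall.

```shell
# Configure PyTorch's CUDA caching allocator to cap split block size,
# reducing fragmentation (the 128 MiB value is an illustrative choice).
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
# ...then rerun the usual training command in the same shell.
```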

LHRLAB commented 7 months ago

Our recommended setup is an A40 (48GB) GPU for training and inference. If CUDA memory is insufficient, you can switch to a smaller model such as Llama-2-7B, reduce the batch size, or make other similar adjustments.
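The batch-size reduction can be paired with gradient accumulation to keep the effective batch size unchanged. A minimal sketch of that trade-off (the specific numbers are illustrative assumptions, not values from the ChatKBQA training configs):

```python
# Trade per-device batch size for gradient accumulation steps: fewer samples
# are resident in GPU memory at once, while the optimizer still updates on
# the same effective batch. Values below are illustrative only.
per_device_train_batch_size = 1   # lowered from e.g. 4 to cut peak memory
gradient_accumulation_steps = 4   # raised from e.g. 1 to compensate
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # same effective batch as 4 x 1
```

Both settings map directly onto the corresponding HuggingFace `TrainingArguments` fields, so only the training config needs to change, not the code.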