Consumer GPU: RTX 3060 (12 GB). Launch command: python -m llmzoo.deploy.cli --model-path FreedomIntelligence/phoenix-inst-chat-7b --max-gpu-memory 10Gib --load-8bit
After typing a prompt at the Human prompt and pressing Enter, it runs out of GPU memory:
OutOfMemoryError: CUDA out of memory. Tried to allocate 1.91 GiB (GPU 0; 11.77 GiB total capacity; 8.59 GiB already allocated; 1.86 GiB free; 8.70 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Did --max-gpu-memory 10Gib not take effect? How can I optimize this? Any advice would be appreciated.
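For reference, this is roughly what I was planning to try next, following the hint in the error message (assuming the PYTORCH_CUDA_ALLOC_CONF setting is actually respected by the deploy script, and that the memory flag expects the "GiB" spelling rather than "Gib"):

```bash
# Sketch only: limit allocator split size to reduce fragmentation,
# as suggested by the OOM message, then relaunch with a lower GPU budget.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
python -m llmzoo.deploy.cli \
    --model-path FreedomIntelligence/phoenix-inst-chat-7b \
    --max-gpu-memory 10GiB \
    --load-8bit
```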