# Pick the GPU with the most free memory (i.e. the least memory in use)
import os
import numpy as np
import torch

os.system("nvidia-smi -q -d Memory | grep -A4 GPU | grep Free > tmp")
memory_gpu = [int(x.split()[2]) for x in open('tmp', 'r').readlines()]
DEVICE_ID = np.argmax(memory_gpu)
torch.cuda.set_device(int(DEVICE_ID))
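The selection logic above can be sketched without a GPU by parsing sample `nvidia-smi` output directly; the sample text and the helper name `pick_freest_gpu` below are illustrative assumptions, not part of the original report:

```python
def pick_freest_gpu(smi_output: str) -> int:
    """Return the index of the GPU with the most free memory, given text
    shaped like `nvidia-smi -q -d Memory | grep -A4 GPU | grep Free`.
    Each line looks like 'Free : 30210 MiB', so token [2] is the value."""
    free_mib = [int(line.split()[2]) for line in smi_output.splitlines() if line.strip()]
    # index of the maximum free-memory value == least-used GPU
    return free_mib.index(max(free_mib))

# Hypothetical sample for a 4-GPU machine (values are made up)
sample = """Free : 44 MiB
Free : 30210 MiB
Free : 102 MiB
Free : 32510 MiB"""

print(pick_freest_gpu(sample))  # → 3
```

This avoids the temporary `tmp` file and makes the parsing step testable in isolation.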
After startup, the program loads ChatGLM-6B-int4 by default; the load succeeds and device=3 is reported.
After switching to ChatGLM-6B-int8 and reloading the model, it fails. GPU usage at that point is as follows:
The specific error is:
CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 31.75 GiB total capacity; 4.25 GiB already allocated; 44.75 MiB free; 4.25 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
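The error message itself suggests one mitigation: if reserved memory far exceeds allocated memory, fragmentation may be the cause, and `PYTORCH_CUDA_ALLOC_CONF` can tune the allocator. A minimal sketch follows; the value `128` is an illustrative assumption, not a recommendation from this report, and the variable must be set before PyTorch initializes CUDA:

```python
import os

# Must run before the first CUDA allocation (ideally before `import torch`):
# caps the allocator's split size to reduce fragmentation of large blocks.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

Note that the traceback reports GPU 0 even though device=3 was selected, so it is also worth verifying that the int8 reload path respects the earlier `torch.cuda.set_device` call.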