System Info / 系統信息
CUDA 12.2
transformers 4.43.3
Python 3.10.12
OS: Ubuntu 22.04
CPU: x86_64 (32 cores, 128 threads)
RAM: 512 GB
GPU: NVIDIA GeForce RTX 4090 (24 GB) × 8
Who can help? / 谁可以帮助到您?
No response
Information / 问题信息
[X] The official example scripts / 官方的示例脚本
[ ] My own modified scripts / 我自己修改的脚本和任务
Reproduction / 复现过程
CUDA_VISIBLE_DEVICES=4 python finetune.py AdvertiseGen/ /data/guanwei/LLM/glm-4-9b-chat configs/lora.yaml
Running LoRA fine-tuning raises an error; it appears to be a GPU compatibility issue.
NotImplementedError: Using RTX 4000 series doesn't support faster communication broadband via P2P or IB. Please set `NCCL_P2P_DISABLE="1"` and `NCCL_IB_DISABLE="1"` or use `accelerate launch` which will do this automatically.
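For reference, a minimal sketch of the workaround the error message itself suggests: disable NCCL P2P and InfiniBand via environment variables before launching, or launch through accelerate. The paths and config file are taken from the command above; whether this resolves the error on this particular setup is an assumption, not a confirmed fix.

```bash
# Workaround suggested by the error message: disable P2P/IB for NCCL,
# then run the same fine-tuning command with a single RTX 4090 visible.
export NCCL_P2P_DISABLE=1
export NCCL_IB_DISABLE=1
CUDA_VISIBLE_DEVICES=4 python finetune.py AdvertiseGen/ /data/guanwei/LLM/glm-4-9b-chat configs/lora.yaml

# Alternative mentioned in the error message: launch via accelerate, which sets
# these variables automatically (argument passthrough shown here is assumed):
# CUDA_VISIBLE_DEVICES=4 accelerate launch finetune.py AdvertiseGen/ /data/guanwei/LLM/glm-4-9b-chat configs/lora.yaml
```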
Expected behavior / 期待表现
Fine-tuning runs successfully and the checkpoint is saved.