FlagOpen / FlagEmbedding

Retrieval and Retrieval-augmented LLMs
MIT License

Reranker fine-tuning run as usual fails with an error saying the GPU cannot be recognized #546

Open sanwei111 opened 6 months ago

sanwei111 commented 6 months ago

ValueError: FP16 Mixed precision training with AMP or APEX (--fp16) and FP16 half precision evaluation (--fp16_full_eval) can only be used on CUDA or NPU devices or certain XPU devices (with IPEX).

My CUDA version is 11.6, but pip list shows that the installed NVIDIA packages are built for CUDA 12. Could that be the cause?

nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.19.3
nvidia-nvjitlink-cu12 12.4.99
nvidia-nvtx-cu12 12.1.105
packaging 24.0

staoxiao commented 6 months ago

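This is most likely an environment problem; I suggest following the fixes available online. Also, the error appears to be triggered by fp16, so you can train without the --fp16 flag.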
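
To narrow down whether the environment is at fault, it may help to check what PyTorch itself reports before launching training. The snippet below is a minimal sketch, assuming the training script runs on PyTorch; the helper name `cuda_report` is hypothetical and not part of FlagEmbedding. It prints the installed PyTorch version, the CUDA version that build was compiled for, and whether a CUDA device is actually visible.

```python
import importlib.util


def cuda_report():
    """Collect the facts that decide whether --fp16 can be used.

    Hypothetical helper for illustration; not part of FlagEmbedding.
    """
    report = {
        "torch_installed": False,
        "torch_version": None,
        "built_for_cuda": None,
        "cuda_available": False,
    }
    # Avoid an ImportError if PyTorch is missing from this environment.
    if importlib.util.find_spec("torch") is None:
        return report

    import torch

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # None here means a CPU-only build of PyTorch.
    report["built_for_cuda"] = torch.version.cuda
    # False here means PyTorch cannot see a usable CUDA device.
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `cuda_available` is False, the fp16 error above is expected, and the usual suspects are a CPU-only PyTorch build or a PyTorch build whose CUDA version is newer than what the installed NVIDIA driver supports. In that case, reinstalling a PyTorch build that matches the driver, or dropping --fp16 as suggested, are the two options to try.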