QwenLM / Qwen2

Qwen2 is the large language model series developed by the Qwen team, Alibaba Cloud.

RuntimeError: at::cuda::blas::gemm: not implemented for struct c10::BFloat16 #755

Open pengmengyin opened 1 week ago

pengmengyin commented 1 week ago

Has anyone run into this problem? The error is raised at the model.generate() call.

jklj077 commented 1 week ago

Hi, which kind of GPU were you using? It is possible that the accelerator you used does not support bfloat16. Try setting torch_dtype=torch.float16 or torch_dtype=torch.float32 when loading the model.
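
For anyone landing here with the same error, a minimal sketch of the suggestion above might look like the following. It assumes the model is loaded through Hugging Face transformers; the helper `pick_torch_dtype_name` and the model path in the comments are hypothetical names used only for illustration, not something from the original report. `torch.cuda.is_bf16_supported()` is used as a rough check and may not be a perfect predictor on every setup, so forcing float16 or float32 by hand is still a reasonable fallback.

```python
def pick_torch_dtype_name(has_cuda: bool, bf16_supported: bool) -> str:
    """Return the name of a torch dtype the device is likely to handle.

    Hypothetical helper: float32 on CPU, bfloat16 on GPUs that report
    bf16 support, float16 on other GPUs.
    """
    if not has_cuda:
        return "float32"
    return "bfloat16" if bf16_supported else "float16"


try:
    import torch

    has_cuda = torch.cuda.is_available()
    # Only query bf16 support when a CUDA device is actually present.
    bf16_ok = has_cuda and torch.cuda.is_bf16_supported()
    torch_dtype = getattr(torch, pick_torch_dtype_name(has_cuda, bf16_ok))
    print("Selected dtype:", torch_dtype)

    # Assumed usage with transformers (model path is a placeholder):
    # from transformers import AutoModelForCausalLM
    # model = AutoModelForCausalLM.from_pretrained(
    #     "path/to/your/model", torch_dtype=torch_dtype
    # )
except ImportError:
    # torch is not installed; nothing to select.
    torch_dtype = None
```

If the error persists with the automatically selected dtype, passing `torch_dtype=torch.float16` or `torch_dtype=torch.float32` directly, as suggested above, bypasses the check entirely.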