Closed (xhz0809 closed this issue 7 months ago)
See https://github.com/pytorch/pytorch/issues/101932 and https://github.com/huggingface/diffusers/issues/3453.
This is caused by the PyTorch library itself. Try updating your torch version and running again. (My PyTorch version is 2.1.2, which works.) Good luck.
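For anyone who wants to check their environment before upgrading, here is a minimal sketch that compares the installed torch version against 2.1.2. That number is only the version reported as working in this thread, not a confirmed minimum for the fix. The helper names `release_tuple` and `older_than_known_good` are made up for illustration and are not part of any library.

```python
import re
from importlib import metadata

# Version reported as working in this thread; an assumption, not a confirmed minimum.
KNOWN_GOOD = "2.1.2"


def release_tuple(version: str) -> tuple:
    """Extract the leading numeric release, e.g. '2.0.1+cu117' -> (2, 0, 1)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component (e.g. '2.1') is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def older_than_known_good(installed: str, known_good: str = KNOWN_GOOD) -> bool:
    """Return True if the installed release is older than the known-good one."""
    return release_tuple(installed) < release_tuple(known_good)


if __name__ == "__main__":
    try:
        installed = metadata.version("torch")
    except metadata.PackageNotFoundError:
        print("torch is not installed in this environment")
    else:
        if older_than_known_good(installed):
            print(f"torch {installed} is older than {KNOWN_GOOD}; consider upgrading")
        else:
            print(f"torch {installed} is at or above {KNOWN_GOOD}")
```

With the System Info from this report (PyTorch 2.0.1), `older_than_known_good("2.0.1")` returns True, which matches the suggestion to upgrade. The check reads package metadata rather than importing torch, so it runs even where torch is broken or missing.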
Reminder
Reproduction
sft code:
error message:
Expected behavior
Thanks for solving the previous --shift-attn issue. After pulling the newer version, the above error occurs. Removing bf16 or using fp16 instead works fine.
System Info
CUDA version: 11.7
transformers version: 4.38.2
Platform: Linux-5.4.0-171-generic-x86_64-with-glibc2.31
Python version: 3.10.13
Huggingface_hub version: 0.21.4
PyTorch version (GPU?): 2.0.1 (True)
Others
No response