thirttyyy closed this issue 1 year ago
Solved. This is probably caused by PyTorch's CUDA architecture support; see this issue: https://github.com/pytorch/pytorch/issues/94883. Change:
train_result = trainer.train(resume_from_checkpoint=checkpoint)
to:
# requires `import torch` at the top of the training script
with torch.cuda.amp.autocast(enabled=True, dtype=torch.bfloat16), \
        torch.backends.cuda.sdp_kernel(enable_flash=False):
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
After training finishes, check whether the training results are affected.
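If you would rather keep the workaround out of the main training call, it can be wrapped in a small helper. This is only a sketch: `flash_disabled_bf16` and `train_without_flash` are hypothetical names, not part of this repo or of PyTorch, and the guard that skips the workaround when CUDA, bf16, or `torch.backends.cuda.sdp_kernel` is unavailable is my own assumption. Note that newer PyTorch releases may emit deprecation warnings for these two context managers.

```python
import contextlib

try:
    import torch
except ImportError:  # lets the sketch be imported on a machine without PyTorch
    torch = None


@contextlib.contextmanager
def flash_disabled_bf16():
    """Enable bf16 autocast and disable the flash SDP kernel (hypothetical helper).

    Yields True when the workaround is active and False when it was skipped.
    """
    usable = (
        torch is not None
        and torch.cuda.is_available()
        and torch.cuda.is_bf16_supported()
        and hasattr(torch.backends.cuda, "sdp_kernel")
    )
    if not usable:
        # Assumption: on an unsupported setup, run the wrapped code unchanged.
        yield False
        return
    with torch.cuda.amp.autocast(enabled=True, dtype=torch.bfloat16), \
            torch.backends.cuda.sdp_kernel(enable_flash=False):
        yield True


def train_without_flash(train_fn):
    """Call train_fn() inside the workaround context and return its result."""
    with flash_disabled_bf16():
        return train_fn()
```

In the training script this would be called as `train_result = train_without_flash(lambda: trainer.train(resume_from_checkpoint=checkpoint))`.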
See https://github.com/pytorch/pytorch/issues/94883. Switching to a different torch version fixed it for me:
pip install --force-reinstall --pre torch --index-url https://download.pytorch.org/whl/nightly/cu117
Hello, I followed the README and have already updated the transformers library to the latest version. The error message is as follows:
I'm using a single A40 card. What should I change?