huggingface / accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
https://huggingface.co/docs/accelerate
Apache License 2.0

ValueError: Attempting to unscale FP16 gradients. #2819

Status: Closed (NimbusLongfei closed this issue 2 months ago)

NimbusLongfei commented 4 months ago

System Info

Name: accelerate
Version: 0.30.1
Summary: Accelerate
Home-page: https://github.com/huggingface/accelerate
Author: The HuggingFace team
Author-email: zach.mueller@huggingface.co
License: Apache

Reproduction

When I was debugging in VSCode, the following error appeared, but strangely it did not occur when using `accelerate launch`: `ValueError: Attempting to unscale FP16 gradients.` Note that the settings and parameters for debugging and running are exactly the same.
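One way to check whether the two launch methods really see the same settings is to print what the process receives in each case and compare the output. The sketch below is only a diagnostic aid, not part of Accelerate. The helper name `launch_fingerprint` is hypothetical, and the sketch assumes that `accelerate launch` hands its configuration to the script through `ACCELERATE_*` environment variables (for example `ACCELERATE_MIXED_PRECISION`), which a plain debugger run would not set unless configured to. That assumption should be verified against the installed version.

```python
import os
import sys


def launch_fingerprint(environ=None):
    """Collect environment variables that commonly differ between launchers.

    The prefixes below are assumptions about what a launcher may set;
    extend the tuple if your setup uses other variables.
    """
    env = os.environ if environ is None else environ
    prefixes = ("ACCELERATE_", "CUDA_VISIBLE_DEVICES", "WORLD_SIZE", "RANK", "LOCAL_RANK")
    # str.startswith accepts a tuple, so one call covers every prefix.
    return {key: value for key, value in sorted(env.items()) if key.startswith(prefixes)}


if __name__ == "__main__":
    # Run this once under the debugger and once under `accelerate launch`,
    # then compare the two outputs.
    print("argv:", sys.argv)
    for key, value in launch_fingerprint().items():
        print(f"{key}={value}")
```

If the mixed-precision setting shows up in only one of the two outputs, the runs are not configured identically even though the script arguments match.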

Expected behavior

I would like to know why this error occurs.

SunMarc commented 3 months ago

Hi @NimbusLongfei, thanks for reporting! Could you share a minimal reproducer?

Abhrant commented 3 months ago

Hi @NimbusLongfei, is it possible that you are using BnB 4-bit quantization with FP16=True? If that's the case, it gives this error.
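For context, PyTorch's gradient scaler raises this error when it is asked to unscale gradients that are themselves stored in FP16, which happens when trainable parameters were loaded in FP16 while FP16 mixed precision is also enabled. A minimal sketch for spotting such parameters follows. The helper name `fp16_trainable_params` is hypothetical, and stand-in objects are used instead of a real model so the example runs without PyTorch installed.

```python
from types import SimpleNamespace


def fp16_trainable_params(named_params):
    """Return the names of trainable parameters stored in float16.

    `named_params` is any iterable of (name, param) pairs, for example
    `model.named_parameters()` on a torch.nn.Module.
    """
    return [
        name
        for name, param in named_params
        # str(torch.float16) is "torch.float16", so no torch import is needed here.
        if param.requires_grad and str(param.dtype) == "torch.float16"
    ]


# Stand-in parameters (illustrative names, not taken from the issue).
fake_params = [
    ("lora_A.weight", SimpleNamespace(dtype="torch.float16", requires_grad=True)),
    ("base.weight", SimpleNamespace(dtype="torch.float16", requires_grad=False)),
    ("head.weight", SimpleNamespace(dtype="torch.float32", requires_grad=True)),
]
print(fp16_trainable_params(fake_params))  # ['lora_A.weight']
```

With a real model, `fp16_trainable_params(model.named_parameters())` would list the candidates. A common remedy is to keep those parameters in float32 or to turn off FP16 mixed precision, though which one fits depends on the training setup.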

github-actions[bot] commented 3 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.