pytorch / torchtune

PyTorch native finetuning library
https://pytorch.org/torchtune/main/
BSD 3-Clause "New" or "Revised" License

clip_grad_norm=None doesn't work #1939

Closed felipemello1 closed 4 weeks ago

felipemello1 commented 4 weeks ago

When using gradient accumulation or optimizer_in_bwd together with clip_grad_norm=None on the CLI, the following error is raised:

raise RuntimeError(
    "Gradient clipping is not supported with optimizer in bwd."
    "Please set clip_grad_norm=None, or optimizer_in_bwd=False."
)

I believe that "None" is being interpreted as a string by the CLI rather than as Python's None.
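
A minimal sketch of the suspected behavior (an assumption for illustration, not torchtune's actual parsing or recipe code): if the override clip_grad_norm=None reaches the recipe as the literal string "None", an `is not None` check still treats clipping as enabled, so the guard against combining it with optimizer_in_bwd fires.

    # Illustrative only: variable names are hypothetical, not torchtune internals.
    clip_grad_norm = "None"  # what the CLI override is suspected to resolve to
    optimizer_in_bwd = True

    if optimizer_in_bwd and clip_grad_norm is not None:
        # The string "None" is not the None object, so this branch runs and the
        # RuntimeError above would be raised even though the user meant to
        # disable clipping.
        print("clipping considered enabled:", repr(clip_grad_norm))

    # Hypothetical normalization (illustration of a possible fix): coerce the
    # string "None" back to Python's None before the check.
    if isinstance(clip_grad_norm, str) and clip_grad_norm.lower() == "none":
        clip_grad_norm = None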