
Convert gradient accumulation with Accelerate #179

Open cridin1 opened 8 months ago

cridin1 commented 8 months ago

In the train function inside run.py, the loss is manually divided by the number of accumulation steps:

            if args.gradient_accumulation_steps > 1:
                loss = loss / args.gradient_accumulation_steps

This manual scaling could be replaced with Accelerate's built-in gradient accumulation. For details, see: https://huggingface.co/docs/accelerate/v0.11.0/en/gradient_accumulation
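
A minimal sketch of what that guide describes, for illustration: the names `model`, `optimizer`, `train_dataloader`, and `args` are assumed stand-ins for the objects built in run.py, and the loss computation is a placeholder. The `Accelerator(gradient_accumulation_steps=...)` constructor and the `accumulate()` context manager are the API documented at the link above.

    from accelerate import Accelerator

    # Assumed stand-ins for the objects constructed in run.py.
    accelerator = Accelerator(gradient_accumulation_steps=args.gradient_accumulation_steps)
    model, optimizer, train_dataloader = accelerator.prepare(model, optimizer, train_dataloader)

    for batch in train_dataloader:
        # accumulate() scales the loss internally and defers the optimizer
        # step until the configured number of batches has been seen, so the
        # manual `loss / args.gradient_accumulation_steps` goes away.
        with accelerator.accumulate(model):
            loss = model(**batch).loss  # placeholder loss computation
            accelerator.backward(loss)
            optimizer.step()
            optimizer.zero_grad()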