Open CaRRotOne opened 2 years ago
In PyTorch I can use no_sync() with DDP to do gradient accumulation, but I haven't found a related interface in Bagua.
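For reference, this is the PyTorch pattern being asked about: DDP's `no_sync()` context manager suppresses the gradient all-reduce on intermediate micro-batches, so gradients only synchronize on the final backward of each accumulation window. A minimal runnable sketch (single process with the gloo backend just for illustration; `accumulation_steps`, the toy model, and the random batches are placeholders):

```python
import contextlib
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# Single-process process group so the example is self-contained.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("gloo", rank=0, world_size=1)

model = DDP(nn.Linear(4, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

accumulation_steps = 4
batches = [(torch.randn(8, 4), torch.randn(8, 1)) for _ in range(8)]

for step, (inputs, targets) in enumerate(batches):
    # Skip the gradient all-reduce on every micro-batch except the last
    # one in the accumulation window.
    is_sync_step = (step + 1) % accumulation_steps == 0
    ctx = contextlib.nullcontext() if is_sync_step else model.no_sync()
    with ctx:
        loss = loss_fn(model(inputs), targets) / accumulation_steps
        loss.backward()
    if is_sync_step:
        optimizer.step()
        optimizer.zero_grad()

dist.destroy_process_group()
```

The question is whether Bagua exposes an equivalent way to defer gradient synchronization across micro-batches.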
@shjwudp there is an issue related to gradient accumulation:

> In PyTorch I can use no_sync() with DDP to do gradient accumulation, but I haven't found a related interface in Bagua.