Open CaRRotOne opened 2 years ago
In PyTorch I can use no_sync() with DDP to do gradient accumulation, but I haven't found a related interface in Bagua.
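For reference, this is the PyTorch pattern being asked about: DDP's `no_sync()` context manager suppresses the gradient all-reduce on intermediate micro-batches, so gradients only synchronize on the final backward of each accumulation window. A minimal runnable sketch (single process with the gloo backend just for illustration; `accumulation_steps`, the toy model, and the random batches are placeholders):

```python
import contextlib
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# Single-process process group so the example is self-contained.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("gloo", rank=0, world_size=1)

model = DDP(nn.Linear(4, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

accumulation_steps = 4
batches = [(torch.randn(8, 4), torch.randn(8, 1)) for _ in range(8)]

for step, (inputs, targets) in enumerate(batches):
    # Skip the gradient all-reduce on every micro-batch except the last
    # one in the accumulation window.
    is_sync_step = (step + 1) % accumulation_steps == 0
    ctx = contextlib.nullcontext() if is_sync_step else model.no_sync()
    with ctx:
        loss = loss_fn(model(inputs), targets) / accumulation_steps
        loss.backward()
    if is_sync_step:
        optimizer.step()
        optimizer.zero_grad()

dist.destroy_process_group()
```

The question is whether Bagua exposes an equivalent way to defer gradient synchronization across micro-batches.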
@shjwudp there is an issue related to gradient accumulation:

> In PyTorch I can use no_sync() with DDP to do gradient accumulation, but I haven't found a related interface in Bagua.