Fridge003 opened this issue 9 months ago
Hi, any updates? I need this feature badly.
Or is it possible to enable it on `HybridParallelPlugin` in a torch-like way (as described in the documentation)? However, unlike `GeminiPlugin`, it seems there is no `enable_gradient_accumulation` option for `HybridParallelPlugin`. It's confusing.
Hi, we will implement this feature as soon as possible.
Hi, you can use gradient accumulation with `HybridParallelPlugin` in the following way: https://colossalai.org/docs/features/gradient_accumulation_with_booster
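For reference, here is a minimal sketch of manual gradient accumulation through the Booster API, following the general pattern in the linked documentation: scale the loss by the accumulation factor and only step the optimizer every few micro-batches. The toy model, data, plugin arguments (`tp_size=1, pp_size=1`), and `GRADIENT_ACCUMULATION` are placeholder assumptions, and the launch call may differ between ColossalAI versions.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import HybridParallelPlugin

# Launch arguments vary across ColossalAI versions (older releases take a config dict).
colossalai.launch_from_torch()

# Toy model and data, stand-ins for a real training setup.
model = nn.Linear(16, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
dataset = TensorDataset(torch.randn(64, 16), torch.randint(0, 2, (64,)))
train_dataloader = DataLoader(dataset, batch_size=8)

plugin = HybridParallelPlugin(tp_size=1, pp_size=1)  # assumed minimal configuration
booster = Booster(plugin=plugin)
model, optimizer, criterion, train_dataloader, _ = booster.boost(
    model, optimizer, criterion, train_dataloader
)

GRADIENT_ACCUMULATION = 4
optimizer.zero_grad()
for idx, (inputs, labels) in enumerate(train_dataloader):
    outputs = model(inputs)
    # Scale the loss so the accumulated gradient matches one large-batch update.
    loss = criterion(outputs, labels) / GRADIENT_ACCUMULATION
    booster.backward(loss, optimizer)
    # Step and clear gradients only every GRADIENT_ACCUMULATION micro-batches.
    if (idx + 1) % GRADIENT_ACCUMULATION == 0:
        optimizer.step()
        optimizer.zero_grad()
```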
@flybird11111 Hi, I didn't find `enable_gradient_accumulation` or `no_sync()` in `HybridParallelPlugin` (https://github.com/hpcaitech/ColossalAI/blob/main/colossalai/booster/plugin/hybrid_parallel_plugin.py), so I wonder how to add gradient accumulation with `HybridParallelPlugin` following https://colossalai.org/docs/features/gradient_accumulation_with_booster. Can you provide more details?
Support gradient accumulation for `HybridParallelPlugin` (by implementing the `no_sync` method for the plugin).
Relevant issue: #4776
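With a `no_sync` method on the plugin, the training loop can skip gradient synchronization on intermediate micro-batches and only all-reduce on the final one, mirroring the torch DDP pattern. Below is a hedged sketch of what that usage might look like; it reuses the boosted `model`, `optimizer`, `criterion`, and `train_dataloader` from the sketch above, and the exact signature of `booster.no_sync` (e.g. whether it also takes the optimizer) is an assumption that may differ from the merged implementation.

```python
GRADIENT_ACCUMULATION = 4
optimizer.zero_grad()
for idx, (inputs, labels) in enumerate(train_dataloader):
    is_sync_step = (idx + 1) % GRADIENT_ACCUMULATION == 0
    if not is_sync_step:
        # Accumulate local gradients without triggering gradient all-reduce.
        with booster.no_sync(model):
            loss = criterion(model(inputs), labels) / GRADIENT_ACCUMULATION
            booster.backward(loss, optimizer)
    else:
        # Last micro-batch of the window: synchronize gradients and step.
        loss = criterion(model(inputs), labels) / GRADIENT_ACCUMULATION
        booster.backward(loss, optimizer)
        optimizer.step()
        optimizer.zero_grad()
```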