At the moment we use `DDPStrategy` to train models that don't require gradient synchronization (`IncrementalPCA`, `OnePassMeanVarStd`). However, the DDP strategy requires that the model has parameters, which is why we have `_dummy_param` in these models. This works, but it gets a bit inconvenient when the trained model is used as a transform, because `_dummy_param` becomes an unused parameter and DDP complains about it (this has been fixed in #186).
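For context, a minimal sketch of the pattern being described (the class name, constructor signature, and buffer names here are hypothetical, not the actual implementations of these models):

```python
import torch
import torch.nn as nn


class OnePassStatsSketch(nn.Module):
    """Hypothetical sketch of a gradient-free model wrapped in DDP."""

    def __init__(self, n_features: int) -> None:
        super().__init__()
        # DDP refuses to wrap a module with no parameters, so a zero-size
        # placeholder parameter is registered even though nothing is ever
        # learned via backprop.
        self._dummy_param = nn.Parameter(torch.empty(0))
        # The actual state lives in buffers updated without gradients.
        self.register_buffer("mean", torch.zeros(n_features))
        self.register_buffer("var", torch.ones(n_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # When reused as a transform, only the buffers matter;
        # _dummy_param never receives a gradient, which is what DDP
        # complains about.
        return (x - self.mean) / self.var.sqrt()
```

Since `_dummy_param` never gets a gradient, DDP's reducer reports it as a parameter that did not receive a gradient during the backward pass, which is the complaint mentioned above.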