Hi there. I want to ask whether the current rules allow the use of distributed optimizer wrappers such as hvd.DistributedOptimizer.

I see that the competition has a list of allowed optimizers, but perhaps a distributed optimizer wrapper is a special case? My reasons are as follows:

1. For most reference implementations, wrapping the optimizer has no effect on the algorithm itself, e.g. Mask-RCNN with torch.optim.SGD.
2. The distributed training communication framework is an important part of training, and many companies are also working on optimizing its efficiency. Supporting distributed optimizer wrappers would help promote technical exchange on distributed communication.

Thanks
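
To illustrate the first reason, here is a minimal sketch of the wrapper pattern. The class names (`SGD`, `DistributedOptimizerWrapper`) and the list-based "allreduce" stand-in are hypothetical, not the Horovod API; the point is that such a wrapper only averages gradients across workers before delegating the parameter update to the wrapped optimizer, so the update rule itself is unchanged:

```python
class SGD:
    """Toy plain SGD: value <- value - lr * grad."""
    def __init__(self, params, lr):
        self.params = params  # list of dicts: {"value": float, "grad": float}
        self.lr = lr

    def step(self):
        for p in self.params:
            p["value"] -= self.lr * p["grad"]


class DistributedOptimizerWrapper:
    """Hypothetical wrapper: average each parameter's gradient across
    workers (a stand-in for an allreduce), then delegate to the wrapped
    optimizer's step(). The update rule is untouched."""
    def __init__(self, optimizer, world_grads):
        self.optimizer = optimizer
        self.world_grads = world_grads  # one gradient list per worker

    def step(self):
        for i, p in enumerate(self.optimizer.params):
            grads = [g[i] for g in self.world_grads]
            p["grad"] = sum(grads) / len(grads)  # allreduce-average stand-in
        self.optimizer.step()


# With a single worker the wrapped optimizer reproduces plain SGD exactly.
params = [{"value": 1.0, "grad": 0.5}]
opt = DistributedOptimizerWrapper(SGD(params, lr=0.1), world_grads=[[0.5]])
opt.step()
print(params[0]["value"])  # 1.0 - 0.1 * 0.5 = 0.95
```

In other words, the wrapper changes *where* gradients come from, not *how* parameters are updated, which is why it arguably should not count as a different optimizer.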