Kunlun-Zhu closed 2 years ago
Here is the direct link to the file: transformer
It is for modules that have a DistributedParameter as a direct child. A DistributedModule can be used like an nn.Module and can be a submodule of another nn.Module.
Thanks for the answer!
So will the DistributedModule automatically convert its nn.Parameter objects into DistributedParameter objects? Is that what you mean? In the file 'transformer.py', will the program go wrong if we change nn.Module to bmt.DistributedModule? Or do you mean that using nn.Module is safe when we use a model from model_center, or when we convert the parameters into DistributedParameter manually? Thanks.
So will the DistributedModule automatically convert its nn.Parameter objects into DistributedParameter objects?
No. You should convert each nn.Parameter into a DistributedParameter manually, even inside a DistributedModule.
In the file 'transformer.py', will the program go wrong if we change nn.Module to bmt.DistributedModule?
This should not go wrong. bmt.DistributedModule only takes care of the DistributedParameter objects held directly in it. If it holds none, nn.Module and bmt.DistributedModule should work the same.
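To make "directly in it" concrete, here is a minimal sketch using plain PyTorch only (no BMTrain, so it runs on CPU). The class names Leaf and Container and the helper direct_parameter_names are hypothetical, chosen for illustration. It shows that a container module such as an encoder typically owns no parameters itself, only submodules, which is consistent with the answer above that nn.Module is enough for such a container.

```python
import torch
from torch import nn


class Leaf(nn.Module):
    """Hypothetical layer that owns a parameter directly."""

    def __init__(self, dim):
        super().__init__()
        # With BMTrain, per the answer above, this is the kind of module that
        # would subclass bmt.DistributedModule and hold a DistributedParameter.
        # Plain nn.Parameter is used here so the sketch runs without BMTrain.
        self.weight = nn.Parameter(torch.zeros(dim))

    def forward(self, x):
        return x + self.weight


class Container(nn.Module):
    """Hypothetical encoder-like container: it only holds submodules."""

    def __init__(self, dim, num_layers):
        super().__init__()
        self.layers = nn.ModuleList([Leaf(dim) for _ in range(num_layers)])

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x


def direct_parameter_names(module):
    # recurse=False restricts the listing to parameters that are direct
    # attributes of this module, ignoring those of its submodules.
    return [name for name, _ in module.named_parameters(recurse=False)]


model = Container(dim=4, num_layers=2)
container_direct = direct_parameter_names(model)           # nothing owned directly
leaf_direct = direct_parameter_names(model.layers[0])      # the leaf owns 'weight'
all_names = [name for name, _ in model.named_parameters()]  # recursive view
```

Here the container has no direct parameters, while each leaf has one. Whether a given file in model_center follows exactly this layout is not confirmed by this thread.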
Thanks, this solves my questions.
May I ask about line 23:
Why do the encoder and decoder use 'nn.Module' instead of 'bmt.DistributedModule'?
In which circumstances should we use 'nn.Module' instead?
Thanks