The quantizer setup is now decided during create_compressed_model, and in precision init cases the resulting setup depends on the data loaders used for initialization. Under DDP, each process may therefore receive significantly different data values and compute a different quantizer setup of its own. Because the quantizer setup as a whole is not a torch.Tensor, it cannot be broadcast to all processes using PyTorch facilities.
A special tensor-only synchronization object is required so that precision init (which determines the quantizer setup) happens in only one process of the DDP group, and the resulting quantizer setup is then broadcast to the other processes in the group.
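The tensor-only synchronization idea can be illustrated with a minimal sketch. It is not the implementation in this PR: the function names (state_to_tensor, tensor_to_state, sync_quantizer_setup) are hypothetical, and it assumes that the quantizer setup can be reduced to a JSON-serializable dict and that the process group uses a CPU-capable backend such as gloo. Only the source rank runs the expensive computation; the result travels as a length tensor followed by a uint8 payload tensor.

```python
import json

import torch
import torch.distributed as dist


def state_to_tensor(state: dict) -> torch.Tensor:
    """Encode a JSON-serializable state dict as a 1-D uint8 tensor."""
    payload = json.dumps(state, sort_keys=True).encode("utf-8")
    return torch.tensor(list(payload), dtype=torch.uint8)


def tensor_to_state(tensor: torch.Tensor) -> dict:
    """Decode a uint8 tensor produced by state_to_tensor."""
    return json.loads(bytes(tensor.tolist()).decode("utf-8"))


def sync_quantizer_setup(compute_state, src: int = 0) -> dict:
    """Run compute_state() on rank `src` only and share the result.

    `compute_state` is a hypothetical callable standing in for precision
    init; it must return a JSON-serializable dict.
    """
    # Without an initialized process group there is nothing to synchronize.
    if not (dist.is_available() and dist.is_initialized()):
        return compute_state()

    is_src = dist.get_rank() == src
    if is_src:
        payload = state_to_tensor(compute_state())
        length = torch.tensor([payload.numel()], dtype=torch.long)
    else:
        length = torch.zeros(1, dtype=torch.long)

    # Step 1: receivers learn how large a buffer to allocate.
    dist.broadcast(length, src=src)
    if not is_src:
        payload = torch.empty(int(length.item()), dtype=torch.uint8)

    # Step 2: broadcast the serialized setup itself.
    # Assumption: CPU tensors are accepted (gloo); with NCCL both tensors
    # would have to be moved to the current CUDA device first.
    dist.broadcast(payload, src=src)
    return tensor_to_state(payload)
```

Every rank would call sync_quantizer_setup at the same point, passing the same callable; ranks other than src never invoke it, so their data loaders do not influence the resulting setup.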