facebookresearch / fairseq2

FAIR Sequence Modeling Toolkit 2
https://facebookresearch.github.io/fairseq2/
MIT License

Introduce broadcast_module #581

Closed cbalioglu closed 3 weeks ago

cbalioglu commented 3 weeks ago

This PR introduces the broadcast_module() helper function and updates the wav2vec2 ASR evaluation recipe to use it. For large models, this significantly reduces disk I/O pressure by broadcasting the module state over the network fabric instead. DDP and FSDP already offer a similar feature, but this helper is standalone and can also be used in evaluation and inference jobs.
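To illustrate the general idea, here is a minimal sketch built only on the public torch.distributed.broadcast call. It is not the fairseq2 implementation: the name broadcast_module_state and its source_rank parameter are hypothetical, and it sends one tensor at a time rather than coalescing them into larger buffers.

```python
import torch
import torch.distributed as dist
from torch import nn


def broadcast_module_state(module: nn.Module, source_rank: int = 0) -> None:
    """Hypothetical helper: copy parameters and buffers from one rank to all others.

    Every rank must construct the same module architecture beforehand so that
    the tensors are broadcast in the same order and with the same shapes.
    """
    # Without an initialized process group there is nothing to synchronize.
    if not dist.is_available() or not dist.is_initialized():
        return

    with torch.no_grad():
        tensors = list(module.parameters()) + list(module.buffers())
        for tensor in tensors:
            # detach() shares storage with the original tensor, so the
            # in-place broadcast overwrites the module's own state.
            dist.broadcast(tensor.detach(), src=source_rank)


# Intended usage (assumed workflow): only the source rank reads the
# checkpoint from disk; the other ranks receive the state over the network.
model = nn.Linear(4, 2)
broadcast_module_state(model, source_rank=0)
```

Broadcasting tensor by tensor keeps the sketch short, but issues one collective call per tensor. Coalescing many small tensors into a few buffers, as the coalesced broadcast mentioned below does, reduces that overhead.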

As of today, we rely on the private torch.distributed._broadcast_coalesced function, although its c10d counterpart is a public API. Once the P0 items are delivered today, I will expose c10d::broadcast_coalesced in fairseq2n and remove the private API use.