facebookresearch / fairseq2

FAIR Sequence Modeling Toolkit 2
https://facebookresearch.github.io/fairseq2/
MIT License

Revise DownloadManager API to handle concurrent calls #302

Open cbalioglu opened 4 months ago

cbalioglu commented 4 months ago

When multiple processes in a distributed job attempt to download the same asset to a shared file system, the download is not race-free: the processes can write to the same path at the same time. We need a way to avoid this in FileDownloadManager.
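One possible direction, as a minimal sketch rather than the actual FileDownloadManager code: each process downloads to its own temporary file in the target directory and then atomically renames it into place. `download_atomically` and the `fetch` callback are hypothetical names used only for illustration, and the sketch assumes the shared file system supports atomic rename within a single directory.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def download_atomically(target: Path, fetch: Callable[[Path], None]) -> Path:
    """Hypothetical helper (not part of fairseq2): publish a download atomically.

    `fetch` is an assumed callback that writes the asset to the path it is given.
    """
    # Fast path: another process has already published the complete file.
    if target.exists():
        return target

    target.parent.mkdir(parents=True, exist_ok=True)

    # Create a uniquely named temporary file in the SAME directory as the
    # target, so the final rename does not cross file system boundaries.
    fd, tmp_name = tempfile.mkstemp(
        dir=target.parent, prefix=target.name + ".", suffix=".tmp"
    )
    os.close(fd)
    tmp_path = Path(tmp_name)

    try:
        fetch(tmp_path)
        # os.replace overwrites the destination in one step, so readers see
        # either no file or a complete file, never a partially written one.
        os.replace(tmp_path, target)
    finally:
        # Remove the temporary file if the download or the rename failed.
        tmp_path.unlink(missing_ok=True)

    return target
```

With this approach concurrent processes may still download the same asset more than once, but none of them can observe or corrupt a half-written file. Avoiding the redundant downloads would need some form of locking on top, which may behave differently across shared file systems.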

artemru commented 4 months ago

@cbalioglu: actually, I'm hitting this error when launching a lot of Sonar data pipelines in parallel on Stopes. Hopefully there's a retry mechanism, but otherwise it feels like P0 is a good priority for this issue!