Closed by AntonioCarta 4 months ago
@lrzpellegrini I think the distributed tests are failing.
Yes, classic distributed issues. The problem is not in the code itself but in how torch shuts down the distributed module when the process exits. I'll add a final sync plus a different way to detect whether a test failed.
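For anyone following along, here is a rough sketch of what a "final sync + failure detection" pattern can look like with `torch.distributed`. This is not the actual change in this PR; the helper names (`any_rank_failed`, `run_and_shutdown`, `demo_single_process`) are made up for illustration, and the single-process `gloo` setup with a temporary file store is only there so the snippet can run on one machine.

```python
import os
import tempfile

import torch
import torch.distributed as dist


def any_rank_failed(local_failed: bool) -> bool:
    """Share the local pass/fail flag so every rank sees the same verdict."""
    flag = torch.tensor([1 if local_failed else 0], dtype=torch.int32)
    # MAX over 0/1 flags is 1 if at least one rank reported a failure.
    dist.all_reduce(flag, op=dist.ReduceOp.MAX)
    return bool(flag.item())


def run_and_shutdown(local_failed: bool) -> int:
    """Hypothetical end-of-test hook: agree on the result, sync, then tear down."""
    failed = any_rank_failed(local_failed)
    # Final sync: no rank starts shutting down while others are still working.
    dist.barrier()
    dist.destroy_process_group()
    # The exit code comes from the shared flag, not from how the process died.
    return 1 if failed else 0


def demo_single_process(local_failed: bool) -> int:
    """Illustration only: a world of size 1 on the gloo backend."""
    store = os.path.join(tempfile.mkdtemp(), "store")
    dist.init_process_group(
        backend="gloo",
        init_method=f"file://{store}",
        rank=0,
        world_size=1,
    )
    return run_and_shutdown(local_failed)
```

The idea is that the pass/fail decision is exchanged explicitly before teardown, so a noisy shutdown on one rank is less likely to be mistaken for a test failure.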
@AlbinSou this should fix your issue.
Closes #1597