Closed by Mellich 1 year ago
Fix failing tests in #170
Some tests also hang with a smaller number of ranks, and different tests hang for different backends (UDP/TCP/RDMA). At least for alltoall and barrier, the cause is that these collectives, as currently implemented in firmware, require RDMA.
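One way to keep such tests from hanging on the other backends is to skip them up front. The following is only a minimal sketch of that idea: `Backend`, `requires_rdma`, and `should_skip` are hypothetical names made up for illustration and are not part of the ACCL API, and the list of RDMA-only collectives contains just the two named above.

```cpp
#include <array>
#include <string_view>

// Hypothetical enum for illustration; not taken from the ACCL sources.
enum class Backend { UDP, TCP, RDMA };

// Collectives reported above as requiring RDMA in the current firmware.
constexpr std::array<std::string_view, 2> kRdmaOnlyCollectives = {
    "alltoall", "barrier"};

// Returns true if the named collective is in the RDMA-only list.
inline bool requires_rdma(std::string_view collective) {
    for (std::string_view name : kRdmaOnlyCollectives) {
        if (name == collective) {
            return true;
        }
    }
    return false;
}

// Returns true if a test for this collective should be skipped
// because the selected backend is not RDMA.
inline bool should_skip(std::string_view collective, Backend backend) {
    return requires_rdma(collective) && backend != Backend::RDMA;
}
```

In a GoogleTest case, a check like `should_skip("barrier", backend)` could guard a `GTEST_SKIP()` call at the top of the test body, so the test is reported as skipped rather than hanging. How the active backend is obtained in the test fixture is left open here.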
Some of the unit tests hang on the dev branch with a higher number of ranks (tested with 10):
Moreover, some other tests in ACCLFuncTest.* are failing.