Xilinx / ACCL

Alveo Collective Communication Library: MPI-like communication operations for Xilinx Alveo accelerators
https://accl.readthedocs.io/
Apache License 2.0

Performance issue with allgather, gather and scatter at large message sizes #118

Closed zhenhaohe closed 1 year ago

zhenhaohe commented 2 years ago

The performance of allgather, gather, and scatter is worse than software OpenMPI at large message sizes, which is not what the paper shows. See the attached performance plots below.

[Three attached performance plots; images not preserved]

quetric commented 2 years ago

@zhenhaohe how many ranks were used in this benchmark?

zhenhaohe commented 2 years ago

This is with 8 ranks.

quetric commented 2 years ago

The latency charts in the paper are mostly for 4 ranks. Could you please post 4-rank charts as well?

quetric commented 1 year ago

Closing: this was found to be due to an incorrect setting of the count parameter in the ACCL calls.
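
The thread does not say which count value was wrong, so the following is only an illustrative sketch of one common way a count mistake can distort a collective benchmark. It assumes MPI-style semantics, where the count passed to gather, allgather, or scatter is the number of elements per rank rather than the size of the full buffer. The helper `gathered_bytes`, the buffer size, and the 4-byte element size are hypothetical and are not part of ACCL or MPI.

```python
# Hypothetical helper, not an ACCL or MPI call: bytes that end up in the
# receive buffer of a gather/allgather when `count` means "elements per rank".
def gathered_bytes(count: int, world_size: int, elem_bytes: int = 4) -> int:
    return count * world_size * elem_bytes

world_size = 8            # rank count reported in the thread
total_elems = 1 << 20     # assumed size of the full gathered buffer, in elements
per_rank = total_elems // world_size

# Intended call: count is the per-rank share of the buffer.
intended = gathered_bytes(per_rank, world_size)

# Mistaken call (assumed for illustration): the total size is passed as count.
mistaken = gathered_bytes(total_elems, world_size)

print(f"intended: {intended} bytes")
print(f"mistaken: {mistaken} bytes ({mistaken // intended}x more data)")
```

Under these assumptions the mistaken call moves `world_size` times more data than intended, so a benchmark labelled with the intended message size would report inflated latency, and the gap would grow with the number of ranks.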