Azure / msccl

Microsoft Collective Communication Library
MIT License

Clarification regarding the performance reported in README #34

Closed: ChenYuHo closed this issue 2 months ago

ChenYuHo commented 5 months ago

Hi,

Could you please explain how the performance numbers in the README were collected?

Could you provide the detailed commands to reproduce the results?

For example, were the algorithms synthesized with msccl-tools for the ND H100 v5 instances? If so, which commands were used?

Andyli1007 commented 2 months ago

Please follow the instructions at https://github.com/Azure/msccl/blob/main/docs/performance-nd-h100-v5.md. You should use the scheduler, and make sure the scheduler uses the algo files in the ndv5 folder.