microsoft / msccl-tools

Synthesizer for optimal collective communication algorithms
MIT License

How to generate the alltoall algorithm (C, S, R) = (24, 8, 8) for DGX-1 within a reasonable time limit? #52

Open oliverYoung2001 opened 1 year ago

oliverYoung2001 commented 1 year ago

I am trying to synthesize the (C, S, R) = (24, 8, 8) algorithm for alltoall on my machine, but the synthesis does not finish even after a full day. The SCCL paper reports that it can be synthesized in 133.7 s. My script is:

TOPO="DGX1"
COLL="Alltoall"
STEPS="8"
ROUNDS=$STEPS
CHUNKS="3"

msccl solve instance ${TOPO} ${COLL} \
    --steps ${STEPS} \
    --rounds ${ROUNDS} \
    --chunks ${CHUNKS}
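
For anyone reproducing this, here is a minimal sketch of how the run could be bounded by a wall-clock limit so that a non-terminating solve is reported rather than left running. It assumes GNU coreutils `timeout` is available; `run_with_limit` is a hypothetical helper name and is not part of msccl-tools.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of msccl-tools): run a command under a
# wall-clock limit and report how it ended.
# Assumes GNU coreutils `timeout`, which exits with status 124 when the
# limit expires.
run_with_limit() {
    local limit="$1"; shift
    local start=$SECONDS

    timeout "$limit" "$@"
    local rc=$?

    local elapsed=$(( SECONDS - start ))
    if [ "$rc" -eq 124 ]; then
        echo "timed out after ${limit} (elapsed ${elapsed}s)" >&2
    else
        echo "exit status ${rc} (elapsed ${elapsed}s)" >&2
    fi
    return "$rc"
}

# Example with the same arguments as the script above, limited to one hour:
# run_with_limit 1h msccl solve instance DGX1 Alltoall --steps 8 --rounds 8 --chunks 3
```

The helper only wraps the command; it does not change how the solver searches, so it helps with measuring and comparing runs rather than making the synthesis itself faster.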

My CPU configuration is:

Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Model name:                      Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz