openucx / ucc

Unified Collective Communication Library
https://openucx.github.io/ucc/
BSD 3-Clause "New" or "Revised" License

TL/UCP: add reduce scatter knomial #970

Closed Sergei-Lebedev closed 1 month ago

Sergei-Lebedev commented 2 months ago

What

Add reduce scatter knomial algorithm in TL/UCP
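
For readers unfamiliar with the pattern, here is a minimal sketch of a generic radix-k (k-nomial) reduce-scatter, simulated in one process. It is not the TL/UCP code from this PR. The function names and the dict-based state are hypothetical, and it assumes the number of ranks is an exact power of the radix, so it does not model extra or proxy ranks.

```python
def knomial_peers(rank, step, radix):
    """Ranks that differ from `rank` only in base-`radix` digit number `step`."""
    dist = radix ** step
    digit = (rank // dist) % radix
    base = rank - digit * dist
    return [base + ((digit + j) % radix) * dist for j in range(1, radix)]


def reduce_scatter_knomial(inputs, radix):
    """Simulate a sum reduce-scatter over len(inputs) ranks.

    inputs[r] is rank r's vector with one value per rank (block b goes to rank b).
    Returns a list where entry r is the fully reduced block owned by rank r.
    Assumption: len(inputs) is an exact power of `radix`.
    """
    n = len(inputs)
    steps, size = 0, 1
    while size < n:
        size *= radix
        steps += 1
    if size != n:
        raise ValueError("this sketch needs the rank count to be a power of the radix")

    # state[r] maps block index -> partial sum currently held by rank r
    state = [dict(enumerate(vec)) for vec in inputs]
    for step in range(steps):
        dist = radix ** step
        new_state = []
        for rank in range(n):
            my_digit = (rank // dist) % radix
            group = [rank] + knomial_peers(rank, step, radix)
            # keep only the blocks whose digit `step` matches this rank's digit,
            # so the held data shrinks by a factor of `radix` per step
            keep = [b for b in state[rank] if (b // dist) % radix == my_digit]
            new_state.append({b: sum(state[p][b] for p in group) for b in keep})
        state = new_state
    return [state[r][r] for r in range(n)]
```

Each rank exchanges with radix - 1 peers per step and finishes in log_radix(N) steps, compared with N - 1 steps for a ring. That is consistent with the latency advantage at small message sizes reported below.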

Performance: 8 nodes, 64 ppn

msgsize   knomial (us)   ring (us)
4         29.45          271.84
8         43.29          792.63
16        53.09          950.44
32        60.85          763.14
64        123.39         883.84
128       531.5          974.35
256       1026.58        1075.77
512       1028.21        1042.87
1024      1942.62        1331.72
2048      2149.71        1287.28
4096      4126.88        2087.69
8192      8338.22        3880.91

Performance: 8 nodes, 1 ppn

msgsize   knomial (us)   ring (us)
4         5.47           11.59
8         5.56           14.15
16        4.69           13.96
32        4.83           13.92
64        5.34           14.36
128       5.52           15.37
256       6.16           15.88
512       7.43           19.06
1024      8.14           19.8
2048      9.48           21.71
4096      11.8           25.79
8192      18.59          30.37
16384     26.32          38.85
32768     47.18          58.44
65536     79.68          109.75
131072    130.79         171.54
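
To make the two tables easier to compare, the short script below computes the ring/knomial latency ratio per message size and finds the first size at which ring becomes faster. The numbers are copied from the tables above; the helper names are hypothetical and only for illustration.

```python
# (msgsize, knomial_us, ring_us), copied from the PR description
RESULTS_64PPN = [
    (4, 29.45, 271.84), (8, 43.29, 792.63), (16, 53.09, 950.44),
    (32, 60.85, 763.14), (64, 123.39, 883.84), (128, 531.5, 974.35),
    (256, 1026.58, 1075.77), (512, 1028.21, 1042.87),
    (1024, 1942.62, 1331.72), (2048, 2149.71, 1287.28),
    (4096, 4126.88, 2087.69), (8192, 8338.22, 3880.91),
]
RESULTS_1PPN = [
    (4, 5.47, 11.59), (8, 5.56, 14.15), (16, 4.69, 13.96),
    (32, 4.83, 13.92), (64, 5.34, 14.36), (128, 5.52, 15.37),
    (256, 6.16, 15.88), (512, 7.43, 19.06), (1024, 8.14, 19.8),
    (2048, 9.48, 21.71), (4096, 11.8, 25.79), (8192, 18.59, 30.37),
    (16384, 26.32, 38.85), (32768, 47.18, 58.44),
    (65536, 79.68, 109.75), (131072, 130.79, 171.54),
]


def speedups(rows):
    """Return (msgsize, ring_us / knomial_us); values above 1 favor knomial."""
    return [(size, ring / knomial) for size, knomial, ring in rows]


def first_size_where_ring_wins(rows):
    """Smallest msgsize at which ring is faster than knomial, or None."""
    for size, knomial, ring in rows:
        if ring < knomial:
            return size
    return None


if __name__ == "__main__":
    for label, rows in (("64 ppn", RESULTS_64PPN), ("1 ppn", RESULTS_1PPN)):
        print(label, "ring wins from msgsize:", first_size_where_ring_wins(rows))
        for size, ratio in speedups(rows):
            print(f"  {size:>7}  {ratio:5.2f}x")
```

On this data, knomial is faster at every measured size with 1 ppn. With 64 ppn, ring is faster from msgsize 1024 upward.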

janjust commented 1 month ago

@manjugv we need one of you to sign off on the review; @Sergei-Lebedev needs a review from someone other than himself.