twichell opened 1 month ago
You might want to try rerunning it with NCCL_DEBUG_SUBSYS=INIT,ENV,TUNING, which will tell us what algo/proto combination NCCL is choosing for every collective operation. I'm guessing it switches over to a different one at 64MB, but the new one is severely underperforming for some reason...
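With the TUNING subsystem enabled, the log contains lines of the form "NCCL INFO <size> Bytes -> Algo <n> proto <n> time <t>". A minimal sketch for collapsing those lines into one choice per message size follows. The function name summarize_choices is hypothetical, and the index-to-name mapping is an assumption: it takes the numeric indices to follow the column order of the tuning table that NCCL prints at init (Tree, Ring, CollNetDirect, CollNetChain, NVLS, NVLSTree and LL, LL128, Simple), which may differ between NCCL versions.

```python
import re

# ASSUMPTION: indices follow the column order of the tuning table NCCL
# prints at init; verify against your NCCL version before relying on it.
ALGOS = ["Tree", "Ring", "CollNetDirect", "CollNetChain", "NVLS", "NVLSTree"]
PROTOS = ["LL", "LL128", "Simple"]

# Matches the TUNING lines anywhere in a line, so it also works when the
# benchmark's own output is interleaved on the same line.
TUNING_RE = re.compile(
    r"NCCL INFO (\d+) Bytes -> Algo (\d+) proto (\d+) time ([\d.]+)"
)


def _name(table, index):
    """Map an index to a name, falling back to a marker for unknown values."""
    return table[index] if 0 <= index < len(table) else f"unknown({index})"


def summarize_choices(log_text):
    """Return {size_bytes: (algo, proto, logged_time)}, sorted by size.

    Only the first occurrence per size is kept, since the same line is
    repeated for every iteration of the benchmark.
    """
    choices = {}
    for match in TUNING_RE.finditer(log_text):
        size = int(match.group(1))
        algo = _name(ALGOS, int(match.group(2)))
        proto = _name(PROTOS, int(match.group(3)))
        choices.setdefault(size, (algo, proto, float(match.group(4))))
    return dict(sorted(choices.items()))


if __name__ == "__main__":
    import sys

    for size, (algo, proto, t) in summarize_choices(sys.stdin.read()).items():
        print(f"{size:>12} B  {algo:<14} {proto:<7} time={t}")
```

Piping the log of a single rank through the script gives one line per message size, which makes it easy to see where the algo/proto combination changes.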
What does the topology look like in the file that's passed via NCCL_TOPO_FILE? What does nvidia-smi topo -m show?
GPU0 GPU1 GPU2 GPU3 GPU4 GPU5 GPU6 GPU7 NIC0 NIC1 NIC2 NIC3 NIC4 NIC5 NIC6 NIC7 NIC8 CPU Affinity NUMA Affinity GPU NUMA ID
GPU0 X NV18 NV18 NV18 NV18 NV18 NV18 NV18 SYS SYS SYS SYS SYS NODE NODE NODE PIX 0-79 0 N/A
GPU1 NV18 X NV18 NV18 NV18 NV18 NV18 NV18 SYS SYS SYS SYS SYS NODE NODE PIX NODE 0-79 0 N/A
GPU2 NV18 NV18 X NV18 NV18 NV18 NV18 NV18 SYS SYS SYS SYS SYS NODE PIX NODE NODE 0-79 0 N/A
GPU3 NV18 NV18 NV18 X NV18 NV18 NV18 NV18 SYS SYS SYS SYS SYS PIX NODE NODE NODE 0-79 0 N/A
GPU4 NV18 NV18 NV18 NV18 X NV18 NV18 NV18 SYS NODE NODE NODE PIX SYS SYS SYS SYS 80-159 1 N/A
GPU5 NV18 NV18 NV18 NV18 NV18 X NV18 NV18 SYS NODE NODE PIX NODE SYS SYS SYS SYS 80-159 1 N/A
GPU6 NV18 NV18 NV18 NV18 NV18 NV18 X NV18 SYS NODE PIX NODE NODE SYS SYS SYS SYS 80-159 1 N/A
GPU7 NV18 NV18 NV18 NV18 NV18 NV18 NV18 X SYS PIX NODE NODE NODE SYS SYS SYS SYS 80-159 1 N/A
NIC0 SYS SYS SYS SYS SYS SYS SYS SYS X SYS SYS SYS SYS SYS SYS SYS SYS
NIC1 SYS SYS SYS SYS NODE NODE NODE PIX SYS X NODE NODE NODE SYS SYS SYS SYS
NIC2 SYS SYS SYS SYS NODE NODE PIX NODE SYS NODE X NODE NODE SYS SYS SYS SYS
NIC3 SYS SYS SYS SYS NODE PIX NODE NODE SYS NODE NODE X NODE SYS SYS SYS SYS
NIC4 SYS SYS SYS SYS PIX NODE NODE NODE SYS NODE NODE NODE X SYS SYS SYS SYS
NIC5 NODE NODE NODE PIX SYS SYS SYS SYS SYS SYS SYS SYS SYS X NODE NODE NODE
NIC6 NODE NODE PIX NODE SYS SYS SYS SYS SYS SYS SYS SYS SYS NODE X NODE NODE
NIC7 NODE PIX NODE NODE SYS SYS SYS SYS SYS SYS SYS SYS SYS NODE NODE X NODE
NIC8 PIX NODE NODE NODE SYS SYS SYS SYS SYS SYS SYS SYS SYS NODE NODE NODE X
Legend:
X = Self
SYS = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
PIX = Connection traversing at most a single PCIe bridge
NV# = Connection traversing a bonded set of # NVLinks
NIC Legend:
NIC0: mlx5_0
NIC1: mlx5_1
NIC2: mlx5_2
NIC3: mlx5_3
NIC4: mlx5_4
NIC5: mlx5_5
NIC6: mlx5_6
NIC7: mlx5_7
NIC8: mlx5_8
<system version="1">
<cpu host_hash="0x8753b8a01ef0a140" numaid="0" affinity="00000000,00000000,0000ffff,ffffffff,ffffffff" arch="x86_64" vendor="GenuineIntel" familyid="6" modelid="143">
<pci busid="0000:a1:00.0" class="0x060400" vendor="0x104c" device="0x8232" subsystem_vendor="0x0000" subsystem_device="0x0000" link_speed="2.5 GT/s PCIe" link_width="1">
<pci busid="0000:a3:00.0" class="0x020000" vendor="0x15b3" device="0x101e" subsystem_vendor="0x15b3" subsystem_device="0x0127" link_speed="32.0 GT/s PCIe" link_width="0">
<nic>
<net name="mlx5_8" dev="7" speed="200000" port="1" latency="0.000000" guid="0x2821dc0003e1a258" maxconn="131072" gdr="1"/>
</nic>
</pci>
<pci busid="0000:a4:00.0" class="0x030200" vendor="0x10de" device="0x2330" subsystem_vendor="0x10de" subsystem_device="0x16c1" link_speed="32.0 GT/s PCIe" link_width="0">
<gpu dev="0" sm="90" rank="0" gdr="1">
<nvlink target="0000:f5:00.0" count="5" tclass="0x068000"/>
<nvlink target="0000:f4:00.0" count="5" tclass="0x068000"/>
<nvlink target="0000:f3:00.0" count="4" tclass="0x068000"/>
<nvlink target="0000:f6:00.0" count="4" tclass="0x068000"/>
</gpu>
</pci>
</pci>
<pci busid="0000:ab:00.0" class="0x060400" vendor="0x104c" device="0x8232" subsystem_vendor="0x0000" subsystem_device="0x0000" link_speed="2.5 GT/s PCIe" link_width="1">
<pci busid="0000:ad:00.0" class="0x020000" vendor="0x15b3" device="0x101e" subsystem_vendor="0x15b3" subsystem_device="0x0127" link_speed="32.0 GT/s PCIe" link_width="0">
<nic>
<net name="mlx5_7" dev="6" speed="200000" port="1" latency="0.000000" guid="0xe823dc0003e1a258" maxconn="131072" gdr="1"/>
</nic>
</pci>
<pci busid="0000:ae:00.0" class="0x030200" vendor="0x10de" device="0x2330" subsystem_vendor="0x10de" subsystem_device="0x16c1" link_speed="32.0 GT/s PCIe" link_width="0">
<gpu dev="1" sm="90" rank="1" gdr="1">
<nvlink target="0000:f6:00.0" count="4" tclass="0x068000"/>
<nvlink target="0000:f4:00.0" count="5" tclass="0x068000"/>
<nvlink target="0000:f5:00.0" count="5" tclass="0x068000"/>
<nvlink target="0000:f3:00.0" count="4" tclass="0x068000"/>
</gpu>
</pci>
</pci>
<pci busid="0000:b5:00.0" class="0x060400" vendor="0x104c" device="0x8232" subsystem_vendor="0x0000" subsystem_device="0x0000" link_speed="2.5 GT/s PCIe" link_width="1">
<pci busid="0000:b7:00.0" class="0x020000" vendor="0x15b3" device="0x101e" subsystem_vendor="0x15b3" subsystem_device="0x0127" link_speed="32.0 GT/s PCIe" link_width="0">
<nic>
<net name="mlx5_6" dev="5" speed="200000" port="1" latency="0.000000" guid="0x822dc0003e1a258" maxconn="131072" gdr="1"/>
</nic>
</pci>
<pci busid="0000:b8:00.0" class="0x030200" vendor="0x10de" device="0x2330" subsystem_vendor="0x10de" subsystem_device="0x16c1" link_speed="32.0 GT/s PCIe" link_width="0">
<gpu dev="2" sm="90" rank="2" gdr="1">
<nvlink target="0000:f4:00.0" count="5" tclass="0x068000"/>
<nvlink target="0000:f3:00.0" count="4" tclass="0x068000"/>
<nvlink target="0000:f6:00.0" count="4" tclass="0x068000"/>
<nvlink target="0000:f5:00.0" count="5" tclass="0x068000"/>
</gpu>
</pci>
</pci>
<pci busid="0000:bf:00.0" class="0x060400" vendor="0x104c" device="0x8232" subsystem_vendor="0x0000" subsystem_device="0x0000" link_speed="2.5 GT/s PCIe" link_width="1">
<pci busid="0000:c1:00.0" class="0x020000" vendor="0x15b3" device="0x101e" subsystem_vendor="0x15b3" subsystem_device="0x0127" link_speed="32.0 GT/s PCIe" link_width="0">
<nic>
<net name="mlx5_5" dev="4" speed="200000" port="1" latency="0.000000" guid="0x2828dc0003e1a258" maxconn="131072" gdr="1"/>
</nic>
</pci>
<pci busid="0000:c2:00.0" class="0x030200" vendor="0x10de" device="0x2330" subsystem_vendor="0x10de" subsystem_device="0x16c1" link_speed="32.0 GT/s PCIe" link_width="0">
<gpu dev="3" sm="90" rank="3" gdr="1">
<nvlink target="0000:f6:00.0" count="4" tclass="0x068000"/>
<nvlink target="0000:f4:00.0" count="5" tclass="0x068000"/>
<nvlink target="0000:f3:00.0" count="4" tclass="0x068000"/>
<nvlink target="0000:f5:00.0" count="5" tclass="0x068000"/>
</gpu>
</pci>
</pci>
</cpu>
<cpu host_hash="0x8753b8a01ef0a140" numaid="1" affinity="ffffffff,ffffffff,ffff0000,00000000,00000000" arch="x86_64" vendor="GenuineIntel" familyid="6" modelid="143">
<pci busid="0000:c9:00.0" class="0x060400" vendor="0x104c" device="0x8232" subsystem_vendor="0x0000" subsystem_device="0x0000" link_speed="2.5 GT/s PCIe" link_width="1">
<pci busid="0000:cb:00.0" class="0x020000" vendor="0x15b3" device="0x101e" subsystem_vendor="0x15b3" subsystem_device="0x0127" link_speed="32.0 GT/s PCIe" link_width="0">
<nic>
<net name="mlx5_4" dev="3" speed="200000" port="1" latency="0.000000" guid="0xb811dc0003e1a258" maxconn="131072" gdr="1"/>
</nic>
</pci>
<pci busid="0000:cc:00.0" class="0x030200" vendor="0x10de" device="0x2330" subsystem_vendor="0x10de" subsystem_device="0x16c1" link_speed="32.0 GT/s PCIe" link_width="0">
<gpu dev="4" sm="90" rank="4" gdr="1">
<nvlink target="0000:f5:00.0" count="5" tclass="0x068000"/>
<nvlink target="0000:f6:00.0" count="4" tclass="0x068000"/>
<nvlink target="0000:f3:00.0" count="4" tclass="0x068000"/>
<nvlink target="0000:f4:00.0" count="5" tclass="0x068000"/>
</gpu>
</pci>
</pci>
<pci busid="0000:d3:00.0" class="0x060400" vendor="0x104c" device="0x8232" subsystem_vendor="0x0000" subsystem_device="0x0000" link_speed="2.5 GT/s PCIe" link_width="1">
<pci busid="0000:d5:00.0" class="0x020000" vendor="0x15b3" device="0x101e" subsystem_vendor="0x15b3" subsystem_device="0x0127" link_speed="32.0 GT/s PCIe" link_width="0">
<nic>
<net name="mlx5_3" dev="2" speed="200000" port="1" latency="0.000000" guid="0xc808dc0003e1a258" maxconn="131072" gdr="1"/>
</nic>
</pci>
<pci busid="0000:d6:00.0" class="0x030200" vendor="0x10de" device="0x2330" subsystem_vendor="0x10de" subsystem_device="0x16c1" link_speed="32.0 GT/s PCIe" link_width="0">
<gpu dev="5" sm="90" rank="5" gdr="1">
<nvlink target="0000:f4:00.0" count="5" tclass="0x068000"/>
<nvlink target="0000:f6:00.0" count="4" tclass="0x068000"/>
<nvlink target="0000:f5:00.0" count="5" tclass="0x068000"/>
<nvlink target="0000:f3:00.0" count="4" tclass="0x068000"/>
</gpu>
</pci>
</pci>
<pci busid="0000:dd:00.0" class="0x060400" vendor="0x104c" device="0x8232" subsystem_vendor="0x0000" subsystem_device="0x0000" link_speed="2.5 GT/s PCIe" link_width="1">
<pci busid="0000:df:00.0" class="0x020000" vendor="0x15b3" device="0x101e" subsystem_vendor="0x15b3" subsystem_device="0x0127" link_speed="32.0 GT/s PCIe" link_width="0">
<nic>
<net name="mlx5_2" dev="1" speed="200000" port="1" latency="0.000000" guid="0xa80bdc0003e1a258" maxconn="131072" gdr="1"/>
</nic>
</pci>
<pci busid="0000:e0:00.0" class="0x030200" vendor="0x10de" device="0x2330" subsystem_vendor="0x10de" subsystem_device="0x16c1" link_speed="32.0 GT/s PCIe" link_width="0">
<gpu dev="6" sm="90" rank="6" gdr="1">
<nvlink target="0000:f4:00.0" count="5" tclass="0x068000"/>
<nvlink target="0000:f5:00.0" count="5" tclass="0x068000"/>
<nvlink target="0000:f6:00.0" count="4" tclass="0x068000"/>
<nvlink target="0000:f3:00.0" count="4" tclass="0x068000"/>
</gpu>
</pci>
</pci>
<pci busid="0000:e7:00.0" class="0x060400" vendor="0x104c" device="0x8232" subsystem_vendor="0x0000" subsystem_device="0x0000" link_speed="2.5 GT/s PCIe" link_width="1">
<pci busid="0000:e9:00.0" class="0x020000" vendor="0x15b3" device="0x101e" subsystem_vendor="0x15b3" subsystem_device="0x0127" link_speed="32.0 GT/s PCIe" link_width="0">
<nic>
<net name="mlx5_1" dev="0" speed="200000" port="1" latency="0.000000" guid="0x480cdc0003e1a258" maxconn="131072" gdr="1"/>
</nic>
</pci>
<pci busid="0000:ea:00.0" class="0x030200" vendor="0x10de" device="0x2330" subsystem_vendor="0x10de" subsystem_device="0x16c1" link_speed="32.0 GT/s PCIe" link_width="0">
<gpu dev="7" sm="90" rank="7" gdr="1">
<nvlink target="0000:f4:00.0" count="5" tclass="0x068000"/>
<nvlink target="0000:f6:00.0" count="4" tclass="0x068000"/>
<nvlink target="0000:f5:00.0" count="5" tclass="0x068000"/>
<nvlink target="0000:f3:00.0" count="4" tclass="0x068000"/>
</gpu>
</pci>
</pci>
</cpu>
</system>
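One way to cross-check a topology file like the one above against the nvidia-smi topo -m matrix is to list which NIC sits under the same top-level PCI bridge as each GPU. The sketch below assumes the layout seen in this file (cpu > pci bridge > pci device > gpu or nic/net); the function name gpu_nic_pairs is hypothetical, and other topology files may nest devices differently.

```python
import xml.etree.ElementTree as ET


def gpu_nic_pairs(xml_text):
    """Return a list of (numa_id, gpu_dev, [nic_names]) tuples.

    ASSUMPTION: each GPU and its local NIC(s) hang off the same top-level
    <pci> element directly below a <cpu> element, as in the file above.
    """
    root = ET.fromstring(xml_text)
    pairs = []
    for cpu in root.iter("cpu"):
        numa_id = int(cpu.get("numaid"))
        for bridge in cpu.findall("pci"):
            # Collect every NIC port and every GPU below this bridge.
            nics = [net.get("name") for net in bridge.iter("net")]
            for gpu in bridge.iter("gpu"):
                pairs.append((numa_id, int(gpu.get("dev")), nics))
    return pairs


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as handle:
        for numa_id, gpu_dev, nics in gpu_nic_pairs(handle.read()):
            print(f"NUMA {numa_id}  GPU {gpu_dev}  NICs {', '.join(nics) or '-'}")
```

For the file above, GPU 0 should be listed with mlx5_8, which agrees with the PIX entry between GPU0 and NIC8 in the nvidia-smi topo -m output.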
[0] # nThread 1 nGpus 1 minBytes 1 maxBytes 8589934592 step: 2(factor) warmup iters: 5 iters: 20 agg iters: 1 validation: 1 graph: 0
[0] #
[0] # Using devices
[0] # Rank 0 Group 0 Pid 497888 on h100clust-worker-1 device 0 [0xa4] NVIDIA H100 80GB HBM3
[0] # Rank 1 Group 0 Pid 497889 on h100clust-worker-1 device 1 [0xae] NVIDIA H100 80GB HBM3
[0] # Rank 2 Group 0 Pid 497890 on h100clust-worker-1 device 2 [0xb8] NVIDIA H100 80GB HBM3
[0] # Rank 3 Group 0 Pid 497891 on h100clust-worker-1 device 3 [0xc2] NVIDIA H100 80GB HBM3
[0] # Rank 4 Group 0 Pid 497892 on h100clust-worker-1 device 4 [0xcc] NVIDIA H100 80GB HBM3
[0] # Rank 5 Group 0 Pid 497893 on h100clust-worker-1 device 5 [0xd6] NVIDIA H100 80GB HBM3
[0] # Rank 6 Group 0 Pid 497894 on h100clust-worker-1 device 6 [0xe0] NVIDIA H100 80GB HBM3
[0] # Rank 7 Group 0 Pid 497895 on h100clust-worker-1 device 7 [0xea] NVIDIA H100 80GB HBM3
[0] # Rank 8 Group 0 Pid 496763 on h100clust-worker-32 device 0 [0xa4] NVIDIA H100 80GB HBM3
[0] # Rank 9 Group 0 Pid 496764 on h100clust-worker-32 device 1 [0xae] NVIDIA H100 80GB HBM3
[0] # Rank 10 Group 0 Pid 496765 on h100clust-worker-32 device 2 [0xb8] NVIDIA H100 80GB HBM3
[0] # Rank 11 Group 0 Pid 496766 on h100clust-worker-32 device 3 [0xc2] NVIDIA H100 80GB HBM3
[0] # Rank 12 Group 0 Pid 496767 on h100clust-worker-32 device 4 [0xcc] NVIDIA H100 80GB HBM3
[0] # Rank 13 Group 0 Pid 496768 on h100clust-worker-32 device 5 [0xd6] NVIDIA H100 80GB HBM3
[0] # Rank 14 Group 0 Pid 496769 on h100clust-worker-32 device 6 [0xe0] NVIDIA H100 80GB HBM3
[0] # Rank 15 Group 0 Pid 496770 on h100clust-worker-32 device 7 [0xea] NVIDIA H100 80GB HBM3
[0] # Rank 16 Group 0 Pid 497321 on h100clust-worker-5 device 0 [0xa4] NVIDIA H100 80GB HBM3
[0] # Rank 17 Group 0 Pid 497322 on h100clust-worker-5 device 1 [0xae] NVIDIA H100 80GB HBM3
[0] # Rank 18 Group 0 Pid 497323 on h100clust-worker-5 device 2 [0xb8] NVIDIA H100 80GB HBM3
[0] # Rank 19 Group 0 Pid 497324 on h100clust-worker-5 device 3 [0xc2] NVIDIA H100 80GB HBM3
[0] # Rank 20 Group 0 Pid 497325 on h100clust-worker-5 device 4 [0xcc] NVIDIA H100 80GB HBM3
[0] # Rank 21 Group 0 Pid 497326 on h100clust-worker-5 device 5 [0xd6] NVIDIA H100 80GB HBM3
[0] # Rank 22 Group 0 Pid 497327 on h100clust-worker-5 device 6 [0xe0] NVIDIA H100 80GB HBM3
[0] # Rank 23 Group 0 Pid 497328 on h100clust-worker-5 device 7 [0xea] NVIDIA H100 80GB HBM3
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO Bootstrap : Using enp0s3:10.241.128.7<0>
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO cudaDriverVersion 12040
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO NCCL version 2.22.3+cuda12.5
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Plugin Path : /usr/local/lib/libnccl-net.so
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO P2P plugin v8 IBext_v8
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO NCCL_IB_ADAPTIVE_ROUTING set by environment to 1.
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO NCCL_IB_PCI_RELAXED_ORDERING set by environment to 2.
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO NET/IB : Using [0]mlx5_1:1/RoCE [1]mlx5_2:1/RoCE [2]mlx5_3:1/RoCE [3]mlx5_4:1/RoCE [4]mlx5_5:1/RoCE [5]mlx5_6:1/RoCE [6]mlx5_7:1/RoCE [7]mlx5_8:1/RoCE [RO]; OOB enp0s3:10.241.128.7<0>
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Using network IBext_v8
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO NCCL_CHECK_POINTERS set by environment to 0.
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO DMA-BUF is available on GPU device 0
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO ncclCommInitRank comm 0x561097745b80 rank 0 nranks 24 cudaDev 0 nvmlDev 0 busId a4000 commId 0x1d39176440db4936 - Init START
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO MNNVL busId 0xa4000 fabric UUID 0.0 cliqueId 0x0 state 3 healthMask 0x0
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO NCCL_TOPO_FILE set by environment to /home/greg/output/mn-h100-vela2.xml.pristine.xml
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO NCCL_TOPO_DUMP_FILE set by environment to /home/greg/output/mn-h100-vela2.xml
[0]
[0] h100clust-worker-1:497888:497958 [0] graph/xml.cc:267 NCCL WARN Unable to open /home/greg/output/mn-h100-vela2.xml, not dumping topology.
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Setting affinity for GPU 0 to ffff,ffffffff,ffffffff
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO NVLS multicast support is available on dev 0
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO NCCL_CROSS_NIC set by environment to 2.
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO comm 0x561097745b80 rank 0 nRanks 24 nNodes 3 localRanks 8 localRank 0 MNNVL 0
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO NVLS Head 0: 0 8 16
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO NVLS Head 1: 1 9 17
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO NVLS Head 2: 2 10 18
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO NVLS Head 3: 3 11 19
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO NVLS Head 4: 4 12 20
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO NVLS Head 5: 5 13 21
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO NVLS Head 6: 6 14 22
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO NVLS Head 7: 7 15 23
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Channel 00/16 : 0 7 6 5 4 3 2 1 8 15 14 13 12 11 10 9 16 23 22 21
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Channel 01/16 : 0 7 6 5 4 3 2 9 8 15 14 13 12 11 10 17 16 23 22 21
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Channel 02/16 : 0 7 6 5 4 3 10 9 8 15 14 13 12 11 18 17 16 23 22 21
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Channel 03/16 : 0 7 6 5 4 11 10 9 8 15 14 13 12 19 18 17 16 23 22 21
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Channel 04/16 : 0 7 6 5 12 11 10 9 8 15 14 13 20 19 18 17 16 23 22 21
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Channel 05/16 : 0 7 6 13 12 11 10 9 8 15 14 21 20 19 18 17 16 23 22 5
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Channel 06/16 : 0 7 14 13 12 11 10 9 8 15 22 21 20 19 18 17 16 23 6 5
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Channel 07/16 : 0 15 14 13 12 11 10 9 8 23 22 21 20 19 18 17 16 7 6 5
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Channel 08/16 : 0 7 6 5 4 3 2 1 8 15 14 13 12 11 10 9 16 23 22 21
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Channel 09/16 : 0 7 6 5 4 3 2 9 8 15 14 13 12 11 10 17 16 23 22 21
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Channel 10/16 : 0 7 6 5 4 3 10 9 8 15 14 13 12 11 18 17 16 23 22 21
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Channel 11/16 : 0 7 6 5 4 11 10 9 8 15 14 13 12 19 18 17 16 23 22 21
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Channel 12/16 : 0 7 6 5 12 11 10 9 8 15 14 13 20 19 18 17 16 23 22 21
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Channel 13/16 : 0 7 6 13 12 11 10 9 8 15 14 21 20 19 18 17 16 23 22 5
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Channel 14/16 : 0 7 14 13 12 11 10 9 8 15 22 21 20 19 18 17 16 23 6 5
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Channel 15/16 : 0 15 14 13 12 11 10 9 8 23 22 21 20 19 18 17 16 7 6 5
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Trees [0] 1/16/-1->0->-1 [1] -1/-1/-1->0->7 [2] 1/-1/-1->0->7 [3] 1/-1/-1->0->7 [4] 1/-1/-1->0->7 [5] 1/-1/-1->0->7 [6] 1/-1/-1->0->7 [7] 1/-1/-1->0->7 [8] 1/16/-1->0->8 [9] -1/-1/-1->0->7 [10] 1/-1/-1->0->7 [11] 1/-1/-1->0->7 [12] 1/-1/-1->0->7 [13] 1/-1/-1->0->7 [14] 1/-1/-1->0->7 [15] 1/-1/-1->0->7
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO NCCL_BUFFSIZE set by environment to 67108864.
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO P2P Chunksize set to 131072
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Algorithm | Tree | Ring | CollNetDirect |
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Protocol | LL | LL128 | Simple | LL | LL128 | Simple | LL | LL128 | Simple |
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Max NThreads | 512 | 640 | 512 | 512 | 640 | 512 | 0 | 0 | 640 |
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Broadcast | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 71.4/ 20.4 | 110.0/ 176.6 | 680.4/ 192.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 |
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Reduce | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 71.4/ 20.4 | 110.0/ 176.6 | 680.4/ 192.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 |
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO AllGather | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 33.0/ 21.3 | 61.9/ 184.3 | 107.8/ 200.3 | 5.6/ 0.0 | 5.6/ 0.0 | 44.0/ 0.0 |
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO ReduceScatter | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 33.0/ 21.3 | 61.9/ 184.3 | 107.8/ 200.3 | 5.6/ 0.0 | 5.6/ 0.0 | 44.0/ 0.0 |
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO AllReduce | 25.2/ 8.7 | 48.5/ 70.4 | 448.0/ 75.1 | 62.8/ 10.6 | 114.0/ 92.2 | 228.4/ 100.2 | 5.6/ 0.0 | 5.6/ 0.0 | 44.0/ 0.0 |
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Algorithm | CollNetChain | NVLS | NVLSTree |
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Protocol | LL | LL128 | Simple | LL | LL128 | Simple | LL | LL128 | Simple |
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Max NThreads | 0 | 0 | 640 | 0 | 0 | 640 | 0 | 0 | 640 |
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Broadcast | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 |
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Reduce | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 |
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO AllGather | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 43.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 |
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO ReduceScatter | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 43.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 |
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO AllReduce | 0.0/ 0.0 | 0.0/ 0.0 | 69.2/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 43.0/ 0.0 | 0.0/ 0.0 | 0.0/ 0.0 | 53.0/ 80.0 |
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO threadThresholds 8/8/64 | 192/8/64 | 512 | 512
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO 16 coll channels, 16 collnet channels, 16 nvls channels, 16 p2p channels, 2 p2p channels per peer
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO CC Off, Multi-GPU CC Off, workFifoBytes 1048576
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO TUNER/Plugin: Failed to find ncclTunerPlugin_v3 symbol.
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO TUNER/Plugin: Failed to find ncclTunerPlugin_v2 symbol, using internal tuner instead.
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO ncclCommInitRank comm 0x561097745b80 rank 0 nranks 24 cudaDev 0 nvmlDev 0 busId a4000 commId 0x1d39176440db4936 - Init COMPLETE
[0] h100clust-worker-1:497888:497958 [0] NCCL INFO Init timings: rank 0 nranks 24 total 3.45 (kernels 0.33, bootstrap 2.75, allgathers 0.10, topo 0.03, graphs 0.12, connections 0.12, rest 0.00)
[0] #
[0] # out-of-place in-place
[0] # size count type redop root time algbw busbw #wrong time algbw busbw #wrong
[0] # (B) (elements) (us) (GB/s) (GB/s) (us) (GB/s) (GB/s)
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:498059 [0] NCCL INFO Channel 00/0 : 17[1] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498059 [0] NCCL INFO Channel 08/0 : 17[1] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498059 [0] NCCL INFO Channel 00/0 : 0[0] -> 7[7] via P2P/CUMEM
[0] h100clust-worker-1:497888:498059 [0] NCCL INFO Channel 01/0 : 0[0] -> 7[7] via P2P/CUMEM
[0] h100clust-worker-1:497888:498059 [0] NCCL INFO Channel 02/0 : 0[0] -> 7[7] via P2P/CUMEM
[0] h100clust-worker-1:497888:498059 [0] NCCL INFO Channel 03/0 : 0[0] -> 7[7] via P2P/CUMEM
[0] h100clust-worker-1:497888:498059 [0] NCCL INFO Channel 04/0 : 0[0] -> 7[7] via P2P/CUMEM
[0] h100clust-worker-1:497888:498059 [0] NCCL INFO Channel 05/0 : 0[0] -> 7[7] via P2P/CUMEM
[0] h100clust-worker-1:497888:498059 [0] NCCL INFO Channel 06/0 : 0[0] -> 7[7] via P2P/CUMEM
[0] h100clust-worker-1:497888:498059 [0] NCCL INFO Channel 08/0 : 0[0] -> 7[7] via P2P/CUMEM
[0] h100clust-worker-1:497888:498059 [0] NCCL INFO Channel 09/0 : 0[0] -> 7[7] via P2P/CUMEM
[0] h100clust-worker-1:497888:498059 [0] NCCL INFO Channel 10/0 : 0[0] -> 7[7] via P2P/CUMEM
[0] h100clust-worker-1:497888:498059 [0] NCCL INFO Channel 11/0 : 0[0] -> 7[7] via P2P/CUMEM
[0] h100clust-worker-1:497888:498059 [0] NCCL INFO Channel 12/0 : 0[0] -> 7[7] via P2P/CUMEM
[0] h100clust-worker-1:497888:498059 [0] NCCL INFO Channel 13/0 : 0[0] -> 7[7] via P2P/CUMEM
[0] h100clust-worker-1:497888:498059 [0] NCCL INFO Channel 14/0 : 0[0] -> 7[7] via P2P/CUMEM
[0] h100clust-worker-1:497888:498059 [0] NCCL INFO Channel 07/0 : 0[0] -> 15[7] [send] via NET/IBext_v8/0(7)/GDRDMA
[0] h100clust-worker-1:497888:498059 [0] NCCL INFO Channel 15/0 : 0[0] -> 15[7] [send] via NET/IBext_v8/0(7)/GDRDMA
[0] h100clust-worker-1:497888:498043 [0] NCCL INFO NCCL_IB_QPS_PER_CONNECTION set by environment to 2.
[0] h100clust-worker-1:497888:498043 [0] NCCL INFO NCCL_IB_GID_INDEX set by environment to 3.
[0] h100clust-worker-1:497888:498043 [0] NCCL INFO NCCL_IB_TIMEOUT set by environment to 22.
[0] h100clust-worker-1:497888:498043 [0] NCCL INFO NCCL_IB_RETRY_CNT set by environment to 10.
[0] h100clust-worker-1:497888:498059 [0] NCCL INFO Connected all rings
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] 0 0 float sum -1[0] 0.42 0.00 0.00 0[0] 0.19 0.00 0.00 0
[0] 0 0 float sum -1[0] 0.18 0.00 0.00 0[0] 0.19 0.00 0.00 0
[0] 4 1 float sum -1[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 00/0 : 0[0] -> 1[1] via P2P/CUMEM
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 02/0 : 0[0] -> 1[1] via P2P/CUMEM
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 03/0 : 0[0] -> 1[1] via P2P/CUMEM
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 04/0 : 0[0] -> 1[1] via P2P/CUMEM
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 05/0 : 0[0] -> 1[1] via P2P/CUMEM
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 06/0 : 0[0] -> 1[1] via P2P/CUMEM
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 07/0 : 0[0] -> 1[1] via P2P/CUMEM
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 08/0 : 0[0] -> 1[1] via P2P/CUMEM
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 10/0 : 0[0] -> 1[1] via P2P/CUMEM
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 11/0 : 0[0] -> 1[1] via P2P/CUMEM
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 12/0 : 0[0] -> 1[1] via P2P/CUMEM
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 13/0 : 0[0] -> 1[1] via P2P/CUMEM
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 14/0 : 0[0] -> 1[1] via P2P/CUMEM
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 15/0 : 0[0] -> 1[1] via P2P/CUMEM
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 07/0 : 0[0] -> 7[7] via P2P/CUMEM
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 15/0 : 0[0] -> 7[7] via P2P/CUMEM
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 00/0 : 16[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 08/0 : 16[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 08/0 : 0[0] -> 8[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 08/0 : 8[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 00/0 : 0[0] -> 16[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Channel 08/0 : 0[0] -> 16[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498087 [0] NCCL INFO Connected all trees
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] 267.6 0.00 0.00 0[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4 Bytes -> Algo 0 proto 0 time 25.200462
[0] 69.59 0.00 0.00 0
[0] 8 2 float sum -1[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] 64.18 0.00 0.00 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8 Bytes -> Algo 0 proto 0 time 25.200924
[0] 59.38 0.00 0.00 0
[0] 16 4 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] 60.01 0.00 0.00 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16 Bytes -> Algo 0 proto 0 time 25.201847
[0] 60.02 0.00 0.00 0
[0] 32 8 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] 59.84 0.00 0.00 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32 Bytes -> Algo 0 proto 0 time 25.203691
[0] 59.86 0.00 0.00 0
[0] 64 16 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] 60.43 0.00 0.00 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 64 Bytes -> Algo 0 proto 0 time 25.207382
[0] 60.06 0.00 0.00 0
[0] 128 32 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] 61.58 0.00 0.00 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 128 Bytes -> Algo 0 proto 0 time 25.214764
[0] 60.77 0.00 0.00 0
[0] 256 64 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] 72.81 0.00 0.01 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 256 Bytes -> Algo 0 proto 0 time 25.229528
[0] 62.05 0.00 0.01 0
[0] 512 128 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] 92.22 0.01 0.01 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 512 Bytes -> Algo 0 proto 0 time 25.259054
[0] 63.54 0.01 0.02 0
[0] 1024 256 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] 66.24 0.02 0.03 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1024 Bytes -> Algo 0 proto 0 time 25.331232
[0] 66.59 0.02 0.03 0
[0] 2048 512 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] 73.09 0.03 0.05 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2048 Bytes -> Algo 0 proto 0 time 25.495272
[0] 72.49 0.03 0.05 0
[0] 4096 1024 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] 75.82 0.05 0.10 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4096 Bytes -> Algo 0 proto 0 time 25.874907
[0] 75.38 0.05 0.10 0
[0] 8192 2048 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] 153.3 0.05 0.10 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8192 Bytes -> Algo 0 proto 0 time 26.549810
[0] 169.9 0.05 0.09 0
[0] 16384 4096 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] 249.8 0.07 0.13 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16384 Bytes -> Algo 0 proto 0 time 27.899622
[0] 190.3 0.09 0.16 0
[0] 32768 8192 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] 288.6 0.11 0.22 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 32768 Bytes -> Algo 0 proto 0 time 30.599243
[0] 190.6 0.17 0.33 0
[0] 65536 16384 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] 497.5 0.13 0.25 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 65536 Bytes -> Algo 0 proto 0 time 37.798233
[0] 345.6 0.19 0.36 0
[0] 131072 32768 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] 402.5 0.33 0.62 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 131072 Bytes -> Algo 0 proto 1 time 51.603912
[0] 222.5 0.59 1.13 0
[0] 262144 65536 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] 281.5 0.93 1.79 0
h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 262144 Bytes -> Algo 0 proto 1 time 54.707825
[0] 230.3 1.14 2.18 0
[0] 524288 131072 float sum -1
h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO NVLS comm 0x561097745b80 headRank 0 nHeads 8 buffSize 1048576 nvlsPerRankSize 33554432 nvlsTotalSize 268435456
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 01/0 : 16[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 02/0 : 16[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 03/0 : 16[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 04/0 : 16[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 05/0 : 16[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 06/0 : 16[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 07/0 : 16[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 09/0 : 16[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 10/0 : 16[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 11/0 : 16[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 12/0 : 16[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 13/0 : 16[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 14/0 : 16[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 15/0 : 16[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 01/0 : 0[0] -> 8[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 03/0 : 0[0] -> 8[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 05/0 : 0[0] -> 8[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 07/0 : 0[0] -> 8[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 09/0 : 0[0] -> 8[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 11/0 : 0[0] -> 8[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 13/0 : 0[0] -> 8[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 15/0 : 0[0] -> 8[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 01/0 : 8[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 03/0 : 8[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 05/0 : 8[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 07/0 : 8[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 09/0 : 8[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 11/0 : 8[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 13/0 : 8[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 15/0 : 8[0] -> 0[0] [receive] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 01/0 : 0[0] -> 16[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 02/0 : 0[0] -> 16[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 03/0 : 0[0] -> 16[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 04/0 : 0[0] -> 16[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 05/0 : 0[0] -> 16[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 06/0 : 0[0] -> 16[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 07/0 : 0[0] -> 16[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 09/0 : 0[0] -> 16[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 10/0 : 0[0] -> 16[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 11/0 : 0[0] -> 16[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 12/0 : 0[0] -> 16[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 13/0 : 0[0] -> 16[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 14/0 : 0[0] -> 16[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Channel 15/0 : 0[0] -> 16[0] [send] via NET/IBext_v8/7/GDRDMA
[0] h100clust-worker-1:497888:498098 [0] NCCL INFO Connected NVLS tree
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] 380.2 1.38 2.64 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 524288 Bytes -> Algo 5 proto 2 time 59.553600
[0] 329.8 1.59 3.05 0
[0] 1048576 262144 float sum -1
h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] 1851.1 0.57 1.09 0
h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1048576 Bytes -> Algo 5 proto 2 time 66.107201
[0] 2291.8 0.46 0.88 0
[0] 2097152 524288 float sum -1
h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] 2452.2 0.86 1.64 0
h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2097152 Bytes -> Algo 5 proto 2 time 79.214401
[0] 393.8 5.33 10.21 0
[0] 4194304 1048576 float sum -1
h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] 621.9 6.74 12.93 0
h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4194304 Bytes -> Algo 5 proto 2 time 105.428802
[0] 656.0 6.39 12.25 0
[0] 8388608 2097152 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] 1008.0 8.32 15.95 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8388608 Bytes -> Algo 5 proto 2 time 157.857605
[0] 1001.5 8.38 16.05 0
[0] 16777216 4194304 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] 1295.6 12.95 24.82 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 16777216 Bytes -> Algo 5 proto 2 time 262.715210
[0] 1321.1 12.70 24.34 0
[0] 33554432 8388608 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] 2239.8 14.98 28.71 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 33554432 Bytes -> Algo 5 proto 2 time 472.430389
[0] 2249.3 14.92 28.59 0
[0] 67108864 16777216 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] 24256 2.77 5.30 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 67108864 Bytes -> Algo 1 proto 1 time 842.177856
[0] 25233 2.66 5.10 0
[0] 134217728 33554432 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] 48957 2.74 5.25 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 134217728 Bytes -> Algo 1 proto 1 time 1570.355713
[0] 46412 2.89 5.54 0
[0] 268435456 67108864 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] 126387 2.12 4.07 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 268435456 Bytes -> Algo 1 proto 2 time 2999.454102
[0] 127194 2.11 4.05 0
[0] 536870912 134217728 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] 196435 2.73 5.24 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 536870912 Bytes -> Algo 1 proto 2 time 5679.147949
[0] 189960 2.83 5.42 0
[0] 1073741824 268435456 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] 373848 2.87 5.50 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 1073741824 Bytes -> Algo 1 proto 2 time 11038.536133
[0] 368024 2.92 5.59 0
[0] 2147483648 536870912 float sum -1
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] 786561 2.73 5.23 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500
[0] 799579 2.69 5.15 0
[0] 4294967296 1073741824 float sum -1[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] 1549446 2.77 5.31 0[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 4294967296 Bytes -> Algo 1 proto 2 time 43194.867188
[0] 1575896 2.73 5.22 0
[0] 8589934592 2147483648 float sum -1[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] 3063867 2.80 5.37 0[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750
[0] 3300068 2.60 4.99 0
[0] h100clust-worker-1:497888:497888 [0] NCCL INFO comm 0x561097745b80 rank 0 nranks 24 cudaDev 0 busId a4000 - Destroy COMPLETE
[0] # Out of bounds values : 0 OK
[0] # Avg bus bandwidth : 4.01902
[0] #
[0]
Your latest log shows that NCCL chooses the Tree algorithm for small message sizes (<512KB), and then switches to NVLSTree-Simple up to 32MB, which is expected. Somewhat unusually (probably because of RoCE?), it switches to Ring-LL128 for 64MB-128MB, but for 256MB and above it switches to Ring-Simple, which is expected.
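To check this kind of reading against a long log, a small script can collapse the repeated TUNING lines into one entry per message size. The following is only a sketch: the regular expression is modeled on the log lines in this thread, the id-to-name tables are my assumption about NCCL's internal numbering (verify against the headers of your NCCL version), and the "time" field is assumed to be NCCL's own model estimate rather than a measurement.

```python
import re
from collections import OrderedDict

# ASSUMPTION: numeric ids as I understand recent NCCL 2.x headers; check your version.
ALGO_NAMES = {0: "Tree", 1: "Ring", 2: "CollNetDirect", 3: "CollNetChain", 4: "NVLS", 5: "NVLSTree"}
PROTO_NAMES = {0: "LL", 1: "LL128", 2: "Simple"}

# search() rather than match(), because nccl-tests result rows are sometimes
# interleaved on the same line as the NCCL INFO output.
TUNING_RE = re.compile(r"NCCL INFO (\d+) Bytes -> Algo (\d+) proto (\d+) time ([\d.]+)")


def summarize_tuning(lines):
    """Return {size_bytes: (algo_id, proto_id, model_time)}, first match per size."""
    choices = OrderedDict()
    for line in lines:
        m = TUNING_RE.search(line)
        if m is None:
            continue
        size = int(m.group(1))
        if size not in choices:
            choices[size] = (int(m.group(2)), int(m.group(3)), float(m.group(4)))
    return choices


# Three lines copied from the log above.
sample = [
    "[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500",
    "[0] 786561 2.73 5.23 0[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 2147483648 Bytes -> Algo 1 proto 2 time 21757.312500",
    "[0] h100clust-worker-1:497888:497888 [0] NCCL INFO 8589934592 Bytes -> Algo 1 proto 2 time 86069.968750",
]

for size, (algo, proto, model_time) in summarize_tuning(sample).items():
    name = f"{ALGO_NAMES.get(algo, algo)}-{PROTO_NAMES.get(proto, proto)}"
    print(f"{size:>12} B -> {name} (model time {model_time:.1f})")
```

In practice you would pass the lines of the full log file instead of `sample`; a change in the printed name between adjacent sizes marks a switch-over point.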
You may want to try experimenting by disabling Ring (NCCL_ALGO=^Ring) and/or enabling just NVLSTree (NCCL_ALGO=NVLSTree) but, since the log didn't reveal any fundamental problems with algorithm selection, I wouldn't be surprised if that doesn't solve it.
My guess is that you are suffering from some sort of network congestion. Have you tried experimenting with other values of NCCL_CROSS_NIC, specifically 0 and 1?
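The experiments suggested above form a small grid, which could be enumerated along these lines. This is a sketch only: it prints the environment assignments next to the all_reduce_perf command from the original report and does not launch anything. The launcher (and how it forwards environment variables to every rank) is not shown in this thread, so that part is left out. Adding NCCL_DEBUG=INFO so that the TUNING subsystem output is printed is my assumption about how the logs were collected.

```python
import itertools
import shlex

# Benchmark command as given in the original report.
BASE_CMD = ["all_reduce_perf", "-b", "1", "-e", "8G", "-f", "2", "-g", "1", "-n", "20"]

# None means "leave NCCL_ALGO unset" (NCCL picks the algorithm itself).
ALGO_SETTINGS = [None, "^Ring", "NVLSTree"]
CROSS_NIC_SETTINGS = ["0", "1"]


def build_experiments():
    """Return one environment dict per (NCCL_ALGO, NCCL_CROSS_NIC) combination."""
    experiments = []
    for algo, cross_nic in itertools.product(ALGO_SETTINGS, CROSS_NIC_SETTINGS):
        env = {
            "NCCL_CROSS_NIC": cross_nic,
            "NCCL_DEBUG": "INFO",  # assumption: needed so the subsystem logs appear
            "NCCL_DEBUG_SUBSYS": "INIT,ENV,TUNING",
        }
        if algo is not None:
            env["NCCL_ALGO"] = algo
        experiments.append(env)
    return experiments


def render(env):
    """Format an experiment as a shell-style 'VAR=value ... command' line."""
    assignments = " ".join(f"{k}={shlex.quote(v)}" for k, v in sorted(env.items()))
    return f"{assignments} {shlex.join(BASE_CMD)}"


for env in build_experiments():
    print(render(env))
```

Changing one variable at a time against the unset-NCCL_ALGO baseline makes it easier to tell whether a difference comes from algorithm selection or from NIC usage.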
Thank you for your review of the data and suggestions. We've made some network infrastructure changes and are seeing improved performance. I'll get back after we've had more time to study the results.
We are seeing an issue with NCCL allreduce performance that we would appreciate Nvidia's help on.
We have three nodes split across two racks: Two nodes on one rack and one node on another rack. Two-node performance either within a rack or across racks is OK. Three-node performance across racks is severely degraded. We've replicated this on different sets of nodes and racks.
The configuration is as follows:
Three nodes: Two on one rack, and one on another rack
8 x H100 GPUs with NVLink in each node
8 x ConnectX-7 dual-port NICs on each node, with 200 Gb links
Each rack has two top-of-rack (TOR) switches; each NIC's ports are split between the TOR switches; TOR switches are connected with spine switches
Virtualized configuration: Nodes are Ubuntu 22.04 VMs
Module versions:
GPU information is provided in nvidia-smi -q output at the bottom
NCCL version 2.22.3+cuda12.5
NCCL environment variable settings:
NCCL command: all_reduce_perf -b 1 -e 8G -f 2 -g 1 -n 20
Example output from the three-node case is below. Bandwidth at the 8 GB data size is 99% lower than in our two-node case. The degradation is noticeable but less severe at smaller data sizes, starting at around 8 KB. We also note that the drop in bandwidth between the 32 MB and 64 MB data sizes is consistent across runs of the test.
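For readers comparing these numbers, the bandwidth columns can be recomputed from the size and time columns. The sketch below assumes the usual nccl-tests convention for allreduce (algorithm bandwidth is size divided by time, and bus bandwidth scales it by 2(n-1)/n for n ranks) and that the time column is in microseconds. The inputs are the 8 GB out-of-place row from the log in this thread (8589934592 bytes, 3063867 us) with 24 ranks (three nodes of 8 GPUs).

```python
def allreduce_bandwidths(size_bytes, time_us, nranks):
    """Return (algbw, busbw) in GB/s, assuming the nccl-tests allreduce convention."""
    algbw = size_bytes / (time_us * 1e-6) / 1e9  # bytes per second -> GB/s
    busbw = algbw * 2 * (nranks - 1) / nranks    # allreduce correction factor
    return algbw, busbw


algbw, busbw = allreduce_bandwidths(8589934592, 3063867, 24)
print(f"algbw={algbw:.2f} GB/s busbw={busbw:.2f} GB/s")
# prints: algbw=2.80 GB/s busbw=5.37 GB/s
```

These match the 2.80 and 5.37 values reported on that row, which suggests the assumed convention is the one the tool used here.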
The output of nvidia-smi -q for one GPU is provided below. This was captured with no workload running.