shengode503 opened 8 months ago
Enabling ACS on the PCI switch is going to hurt performance, since all traffic will have to go back through the root complex. You should first disable ACS on the PCI switch, then run perftest and the NCCL tests on bare metal and check that you get the right performance.
Then you can re-enable ACS, enable ATS on the NIC, and see whether you can get the full performance inside the VM.
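For reference, a minimal sketch of how ACS is commonly cleared (run as root; assumes a pciutils recent enough to know the ECAP_ACS register name, and note the setting does not persist across reboots):

for BDF in $(lspci | awk '{print $1}'); do
    # skip devices that do not expose the ACS extended capability
    setpci -v -s "${BDF}" ECAP_ACS+0x6.w >/dev/null 2>&1 || continue
    # clear the ACS control register so P2P traffic can stay on the switch
    setpci -v -s "${BDF}" ECAP_ACS+0x6.w=0000
done

Afterwards, lspci -vvv | grep -i acsctl should show SrcValid- on the switch downstream ports.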
Hi Sylvain,
Thanks for the support! We've re-run the perftest on bare metal per your recommendation (ACS disabled). The attachment is the log, and the snapshot is the nvidia-smi topology. We are currently preparing the results of the NCCL tests (bare metal/VM) and the perftest (VM), and will update them as soon as possible. Thanks!
Best regards, Kevin
Hi Sylvain,
Here are the additional results we collected. We ran the experiments with three different settings; all the logs are in the attachment (test-logs_0315.zip). Please help us check them. Thanks!
We think the bare-metal results are normal, but the others are lower than expected. Could you please help us check whether the KVM configuration and the PCI topology we used are correct? Also, what is the recommended command for running the NCCL test? Below is the command we currently use. Thanks!
mpirun \
-x NCCL_DEBUG=INFO \
-x NCCL_IB_HCA=mlx5 \
-x NCCL_SOCKET_IFNAME=<ifname> \
-x NCCL_IB_MERGE_VFS=0 \
--bind-to none \
-np 16 -host "<node1-ip>:8,<node2-ip>:8" ./all_reduce_perf -b 8 -f 2 -e 8G -g 1
Best regards, Kevin
Everything I know is in my comment above. Unfortunately, I'm not an expert in debugging PCIe configuration and VM hypervisor setup.
Hi @shengode503, I was wondering how your tests look today. Also, have you tested the cases with ACS and ATS enabled, as @sjeaugey suggested above?
We have the exact same issue here. We got the best result from bare metal with ACS disabled, but we hit an issue when we tried to enable both ACS and ATS.
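For reference, on ConnectX NICs ATS is typically toggled in firmware with mlxconfig from the MFT tools; a sketch (the device path below is only an example, and whether ATS_ENABLED is exposed depends on the firmware):

mst start
# confirm the firmware exposes the knob before setting it
mlxconfig -d /dev/mst/mt4123_pciconf0 query | grep -i ATS
mlxconfig -d /dev/mst/mt4123_pciconf0 set ATS_ENABLED=true
# a reboot (or mlxfwreset) is required for the change to take effect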
Hi,
Firstly, thanks for publishing this open-source tool and for the great support! We have encountered performance issues while running the NCCL tests in a KVM environment across two nodes. The performance is significantly lower than expected. Please advise us on how to improve it. Thanks!
[System]
System: 2x Supermicro SYS-420GP-TNAR+
CPU: node1: Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz; node2: Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz
GPU: 8x NVIDIA A100-SXM4-80GB (per node)
IB cards: 1x MCX623106AC-CDAT, 1x MCX653106A-HDAT
OS: Ubuntu 22.04
[KVM]
QEMU/Hypervisor: 6.2.0
KVM configuration file: (attachment, kvm-cfg.xml)
IB cards: 2 cards, 2 ports per card, 1 VF per port, 4 NICs in total (VF creation sketched below)
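As a point of reference, one VF per port is typically created through sysfs like this (the PF interface name is a placeholder; SR-IOV must already be enabled in the NIC firmware and in the BIOS):

echo 1 | sudo tee /sys/class/net/<pf-ifname>/device/sriov_numvfs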
[Software in KVM]
NVIDIA CUDA: 12.3
NVIDIA Driver: 545.23.08
NVIDIA MLNX Driver: 5.8-4.1.5.0
NVIDIA Fabric Manager: 545.23.08
NVIDIA NCCL: v2.20.3
UCX: v1.14.0
OpenMPI: v4.1.6
GPUDirect has been enabled through:
sudo modprobe nvidia-peermem
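A quick sanity check that the module actually loaded:

lsmod | grep nvidia_peermem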
[NVIDIA Perftest] Perftest has been run to evaluate the performance in the KVM; however, the performance is lower than expected. A single IB card (dual-port, 2 VFs) has been passed through to the KVM, and ib_write_bw has been run against all the GPU and IB devices. Without any tuning, the performance we got is around 85 Gb/s. (attachment, perftest.zip)
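For reference, a representative ib_write_bw invocation (device name and GPU index are illustrative; assumes perftest was built with CUDA support, so --use_cuda exercises the GPUDirect path):

# on the server
ib_write_bw -d mlx5_0 -a -F --report_gbits --use_cuda=0
# on the client
ib_write_bw -d mlx5_0 -a -F --report_gbits --use_cuda=0 <server-ip>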
[NVIDIA NCCL Test] The NCCL test has been run on both bare metal and in the KVM. With 1 IB card (2 ports per card, 1 VF per port, 2 NICs in total), the performance is around 17 GB/s (theoretical: 24 GB/s; we got ~21 GB/s on bare metal). With 2 IB cards (2 ports per card, 1 VF per port, 4 NICs in total), the performance is around 27 GB/s (theoretical: 48 GB/s).
[BIOS configuration] (see attachment)
[KVM lspci topo] (see attachment)
[nvidia-smi topo] (see attachment)
[IB VFs] perftest_logs.zip
Best regards, Kevin