-
Hi all,
What is the command line to run nccl_mpi_all_reduce on a multi-node system (2 nodes with 4 GPUs each)? I am getting the error below when typing this command:
> DeepBench/code$ m…
-
The community.general.nmcli module now supports InfiniBand bonding, by configuring a new option…
jpm38 updated
3 months ago
-
2-node, 16-GPU H20 all-reduce performance is 343 GB/s (with NVL SHARP), but theoretically it should be able to reach 460 GB/s.
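A rough way to sanity-check figures like these is sketched below. It assumes the output follows the nccl-tests column layout (size in bytes, count, type, redop, root, time in microseconds, algbw, busbw) and that 16 ranks take part; both are assumptions, since the preview is truncated. The function names are hypothetical helpers, not part of NCCL.

```python
def allreduce_bandwidth(size_bytes, time_us, n_ranks):
    """Return (algbw, busbw) in GB/s for an all-reduce measurement.

    Uses the nccl-tests convention: algbw = size / time, and for
    all-reduce busbw = algbw * 2 * (n - 1) / n.
    """
    algbw = size_bytes / (time_us * 1e-6) / 1e9
    busbw = algbw * 2 * (n_ranks - 1) / n_ranks
    return algbw, busbw


def efficiency(measured_gbps, theoretical_gbps):
    """Fraction of the theoretical bandwidth actually achieved."""
    return measured_gbps / theoretical_gbps


# Row from the issue preview: 1048576 bytes in 121.8 us, assuming 16 ranks.
algbw, busbw = allreduce_bandwidth(1048576, 121.8, 16)
print(f"algbw = {algbw:.2f} GB/s, busbw = {busbw:.2f} GB/s")

# Reported peak versus the stated theoretical limit.
print(f"efficiency = {efficiency(343, 460):.1%}")
```

Under these assumptions the reported 343 GB/s is roughly three quarters of the stated 460 GB/s, which frames how large the gap in question is.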
```
1048576 262144 float sum -1 121.8 8.61 16.1…
-
## Background information
### What version of Open MPI are you using?
* Open MPI v4.1.4
* UCX v1.13.0
* rdma-core-50mlnx1-1.49417.x86_64
### Describe how Open MPI was installed
Built via Spa…
omor1 updated
9 months ago
-
There seem to be some bugs in openmpi-4.1.4 when it is used with Slurm 17.11.7 and Intel Omni-Path.
My CPUs are Intel Xeon E5-2695v4 (Broadwell nodes with 15 GB /scratch).
When I use `srun --…
-
## Background information
### What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)
repo_rev=v4.1.1-30-g535358e937
### Describe how Open MPI was insta…
-
### Host operating system: output of `uname -a`
`Linux hostname 4.14.97#1 SMP Fri Feb 1 14:23:07 EST 2019 x86_64 GNU/Linux`
### node_exporter version: output of `node_exporter --version`
0.17…
-
### Host operating system: output of `uname -a`
```
$ uname -r
3.10.0-957.41.1.el7.x86_64
```
### node_exporter version: output of `node_exporter --version`
```
$ node_exporter --version
…
-
Hi, I understand that while OPNsense does have the InfiniBand drivers enabled by default, it doesn't seem to have OpenSM and other utilities from the OFED package required to run a Subnet Manager to us…
-
The core RadixSortLSD routine has some startup overheads that can be pretty dramatic, especially at scale. At 75K cores we have seen 20-25 seconds of startup time. For large sorts this is typically a…