-
```
[root@SCSP00596 k8s-rdma-device-plugin]# ./bin/k8s-rdma-device-plugin -master ens2f0
I0520 10:03:24.135872 30847 main.go:31] Fetching devices.
I0520 10:03:24.137327 30847 main.go:39] No dev…
-
I have a fresh install of NixOS on my Surface and tried to add the surface-pro-intel build to my system by following the guidelines.
```
imports =
[ # Include the results of the hardware scan.…
-
```
What steps will reproduce the problem?
1. Run "dnet intf show" on a system with an IB interface
What is the expected output?
Something similar to output from "ip addr show".
3: ib0: mtu 2044 qd…
-
I created a container from the Ubuntu 18.04 image using the rdma-shared device plugin. Running `ib_write_bw` inside the container reports the error below, but with Ubuntu 20.04/22.04 it works w…
-
### 🐛 Describe the bug
TL;DR: I need to monkey patch `torchrun` to be able to handle a `--rendezvous_endpoint` with a format different from `socket.gethostname()`.
# Description
I have a very …
-
## CVE-2019-11599 - High Severity Vulnerability
Vulnerable Library - linuxlinux-4.19.313
The Linux Kernel
Library home page: https://mirrors.edge.kernel.org/pub/linux/kernel/v4.x/?wsslib=linux
Fou…
-
## Background information
### What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)
v4.1.4
### De…
-
I am trying to provision RHEL 8 over an InfiniBand network using xCAT. Provisioning gets stuck after loading the OS profile, as shown in the screenshot below:
![image](https://user-images.githubusercontent…
-
@janekmi @ldorau @osalyk @grom72
I used the following steps to build rpma based on the latest rdma-core:
```
git clone https://github.com/linux-rdma/rdma-core.git
cd rdma-core
mkdir build && c…
-
I'm trying to build PyTorch from source for my CPU cluster with the gloo backend.
After installing PyTorch, I got the following from the install summary:
```
-- USE_DISTRIBUTED : True
-- …