-
Help! How do I solve this problem?
ERROR: Invalid requirement: '_libgcc_mutex=0.1=conda_forge': Expected package name at the start of dependency specifier
_libgcc_mutex=0.1=conda_forge
^ (from…
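
The rejected line looks like the `name=version=build` format that `conda list --export` produces, which pip's requirement parser does not accept. Assuming that is the cause, here is a minimal sketch for finding such lines in a requirements file before passing it to pip. The helper names `is_conda_style` and `find_conda_style_lines` are hypothetical, and the regular expression is only a rough heuristic:

```python
import re
from typing import Iterable, List, Tuple

# Heuristic: conda export lines use single "=" separators
# ("name=version" or "name=version=build"); pip uses "==", ">=", etc.
CONDA_STYLE = re.compile(r"^[A-Za-z0-9_.\-]+=[^=\s]+(?:=[^=\s]+)?$")


def is_conda_style(line: str) -> bool:
    """Return True if the line looks like a conda export entry."""
    return bool(CONDA_STYLE.match(line.strip()))


def find_conda_style_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (1-based line number, text) for every conda-style entry."""
    hits = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text or text.startswith("#"):
            continue  # skip blank lines and comments
        if is_conda_style(text):
            hits.append((number, text))
    return hits


if __name__ == "__main__":
    sample = ["# exported environment", "numpy==1.26.0", "_libgcc_mutex=0.1=conda_forge"]
    for number, text in find_conda_style_lines(sample):
        print(f"line {number}: conda-style entry, not installable with pip as written: {text}")
```

If the file turns out to be a conda export, recreating the environment with conda rather than pip is probably the simpler route.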
-
I am not sure why the GPU-related demos fail to build on my Windows machine. Would you mind helping me debug this issue?
I updated the submodules. I also attached part of the CMake configuration…
-
### System Info
CPU architecture: x86_64
Host RAM: 1TB
GPU: 8xH100 SXM
Container: Manually built container with Dockerfile.trt_llm_backend
TensorRT-LLM version: 0.14.0.dev2024091000
Driver Version: 5…
-
## Docker File Error
``` sh
root@1dd007c03d48:/# horovodrun --gloo -np 1 -H localhost:1 python horovod/examples/pytorch/pytorch_mnist.py
[0]:/usr/local/lib/python3.8/dist-packages/torch/cuda/__i…
-
## Background information
### What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)
v5.0.2
### Describe how Open MPI was installed (e.g., from a sourc…
-
I've tried this on two different systems (cluster and desktop) now. I'm finding that Yank hangs when creating the _second_ cached context object. Have you seen anything like this before?
```
[kyleb…
-
Cliff has made some valid suggestions for information that should be included in the MPI documentation:
+ Mapping GPU devices to MPI processes
+ Using mpirun to correctly place MPI processes …
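
As an illustration of the first point, here is a minimal sketch of one common way to map GPU devices to MPI processes: each process reads its node-local rank and picks a device from it. It assumes Open MPI, which exports `OMPI_COMM_WORLD_LOCAL_RANK` to each launched process; other launchers use different variables (Slurm sets `SLURM_LOCALID`, for example). The function name `gpu_index_for_local_rank` and the value of `gpus_per_node` are hypothetical.

```python
import os
from typing import Mapping, Optional


def gpu_index_for_local_rank(
    gpus_per_node: int,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Pick a GPU index from the node-local MPI rank (round-robin).

    Assumes an Open MPI launch, which sets OMPI_COMM_WORLD_LOCAL_RANK.
    Falls back to rank 0 when the variable is absent (e.g. a serial run).
    """
    if gpus_per_node < 1:
        raise ValueError("gpus_per_node must be at least 1")
    env = os.environ if env is None else env
    local_rank = int(env.get("OMPI_COMM_WORLD_LOCAL_RANK", "0"))
    return local_rank % gpus_per_node


if __name__ == "__main__":
    gpus_per_node = 4  # assumption: adjust to the actual node configuration
    device = gpu_index_for_local_rank(gpus_per_node)
    # Restrict this process to one device; this must happen before any
    # CUDA-using library is initialized in the process.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(device)
    print(f"local rank maps to GPU {device}")
```

The round-robin choice is only one policy. With more ranks than GPUs on a node, several processes end up sharing a device, which may or may not be what is wanted.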
-
**Describe the bug**
CUDA-aware MPI multi-GPU test (available [here](https://gist.github.com/luraess/ed93cc09ba04fe16f63b4219c1811566)) fails by returning the following error message:
```
.cluster.3…
-
Hi,
I am having problems using GPUDirect RDMA in this simple MPI + CUDA example. These are the modules I am using:
gcc/8.4.0, cuda/11.1.0, openmpi/4.0.5, and boost/1.73.0
```
#include
…
-
Dear team,
I'm getting the following error when I run Score-P with a module for tracing python scripts:
**************************
2020-10-20 09:24:14.149317: E tensorflow/stream_executor/cuda/…