-
I am running an MPI + CUDA HPC code on a system using `Open PBS` with multiple nodes, each node having 8 NVIDIA GPUs.
On a `SLURM` cluster, I can use "pinning" in order to assign the MPI
rank to …
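
For reference, the kind of rank-to-GPU assignment I mean can be sketched without any scheduler support. This is only a sketch under assumptions: the helper names (`local_rank`, `gpu_for_local_rank`, `pin_gpu`) are made up for illustration, the round-robin mapping is one possible policy, and it relies on the launcher exporting a node-local rank variable (as far as I know `OMPI_COMM_WORLD_LOCAL_RANK` for Open MPI, `MPI_LOCALRANKID` for Hydra-based launchers, and `SLURM_LOCALID` under `srun`).

```python
import os

# Variables that common launchers export for the node-local rank.
# Assumption: the launcher in use sets at least one of these.
LOCAL_RANK_VARS = (
    "OMPI_COMM_WORLD_LOCAL_RANK",  # Open MPI mpirun
    "MPI_LOCALRANKID",             # Hydra-based launchers (MPICH, Intel MPI)
    "SLURM_LOCALID",               # SLURM srun
)


def local_rank(env=os.environ):
    """Return the node-local rank from the first launcher variable found."""
    for name in LOCAL_RANK_VARS:
        value = env.get(name)
        if value is not None:
            return int(value)
    raise RuntimeError("no node-local rank variable found in the environment")


def gpu_for_local_rank(rank, gpus_per_node=8):
    """Round-robin mapping from node-local rank to a GPU index (hypothetical policy)."""
    return rank % gpus_per_node


def pin_gpu(env=os.environ, gpus_per_node=8):
    """Restrict this process to one GPU; must run before CUDA is initialized."""
    gpu = gpu_for_local_rank(local_rank(env), gpus_per_node)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu)
    return gpu
```

The same logic can live in a small wrapper script that sets `CUDA_VISIBLE_DEVICES` and then launches the real binary, so that the application itself does not need to change.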
-
The failure on #439 is due to CUDA synchronization issues. From what I understand, the changes in https://github.com/JuliaGPU/CUDA.jl/pull/395 mean that streams are no longer globally synchronized. Si…
-
**Describe the bug**
I followed the instructions to install MuJoCo_py but got a gcc compilation error. I suspect a gcc version problem, but changing versions did not help.
**Er…
-
### Steps to reproduce
I installed gromacs@2023.5:
```console
$ spack install gromacs@2023.5+cuda~mpi cuda_arch=61,70,75,80,86 %gcc@9.4.0 ^cuda@12.2.1 ^intel-oneapi-mkl
```
### Error message
The …
-
I've tried this on two different systems (cluster and desktop) now. I'm finding that Yank hangs when creating the _second_ cached context object. Have you seen anything like this before?
```
[kyleb…
-
**Related**
https://github.com/conda-forge/heat-feedstock/pull/15
**Feature functionality**
Verify that everything runs as expected with openmpi 5. So far, we're testing everything on 4.1.x.
…
-
Hi NCCL team, thanks for your work on this great library.
This issue is in regard to a section in the documentation "[Inter-GPU Communication with CUDA-aware MPI](https://docs.nvidia.com/deeplearni…
-
I have built openmpi 4.0.1 against UCX 1.5.2 (and also 1.6) and get segmentation faults in libucs in mpi4py when it is compiled against this MPI. Here are my configure flags for UCX and OpenMPI:
``…
-
### System Info
- CPU architecture: x86_64
- GPU: A10G
- TensorRT-LLM version: v0.9.0 and building from main branch
- Container: dku-exec-base (AlmaLinux8 based)
- NVIDIA driver version: 535.161.…
-
CUDA-QUANTUM version: 0.8.0
OS: Amazon Linux 2023 x86_64
Looking at the basic C++ GHZ script:
```c++
#include <cudaq.h>
template <std::size_t N>
struct ghz {
auto operator()() __qpu__ {
cudaq::qarray<N> q;
h(q[0]);
…