-
Thank you for taking the time to submit an issue!
## Background information
While reviewing another PR, I noticed that several routines (mca_coll_cuda_scan, mca_coll_cuda_reduce, mca_coll_cuda_all…
-
## Issue description
When compiling the `v2.3.0` checkout using the `build-libtorch-2.3.0.sh` script below (can't attach) on WSL1 Ubuntu 22.04 with GCC 11.3.0, where the `BUILD_BINARY` CMake flag is se…
-
CUDA programming, which is essential for ML/AI optimization, is highly sought after in the ML industry, especially now that we have entered the LLM era. In order to make neural network training faster and more…
-
### Describe the bug
While it is understandable that self-communication is slow for MPI_Send/MPI_Recv, it is a very common and reasonable pattern for Alltoall. Still, self-communication seem…
-
## .dylib warning from dlopen() for libcuda on Linux
### Open MPI v3.1.3
### Open MPI was compiled with PGI 19.1 compilers from a source tarb…
-
I get a segmentation fault with some MPI primitives when using CUDA-enabled MPI. The issue seems to appear when XLA is not initialized, as the error disappears if memory is allocated on the GPU before mpi4…
-
## Background information
I'm trying to run Open MPI with Horovod and it's breaking during MPI_Init(). I think it has something to do with PMI.
### What version of Open MPI are you using? (e.g., v…
-
There is a compile-time selection mechanism for the precision in the code base; however, the MPI modules are currently hardcoded for `MPI_DOUBLE_PRECISION`. We need to figure out a way to fix this so…
-
## Background information
Calling MPI_Comm_spawn with multiple MPI processes can lead to an error.
### What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and h…
-
### Required prerequisites
- [X] Search the [issue tracker](https://github.com/NVIDIA/cuda-quantum/issues) to check if your feature has already been mentioned or rejected in other issues.
### Descri…