-
## CUDA Out of Memory error but CUDA memory is almost empty
I am currently training a lightweight model on a very large amount of textual data (about 70 GiB of text).
For that I am using a machine on…
-
binutils is currently taken from the system by default, even when using a compiler that lives under Spack. This is a bit strange given how important it is when building.
### Rationale
To make …
-
### Steps to reproduce the issue
Hello, I'm using the following environment (simplified), which declares an external package for MPI and `py-horovod`. The latter fails to build because of `missing MP…
-
Hello,
I'm trying to build libexpat (v2.5.0) using Spack ([this recipe](https://github.com/spack/spack/blob/63a5cf78acf2fd2c8e2addca4acb2ded7869d878/var/spack/repos/builtin/packages/expat/package.p…
-
Hello :)
**Describe the bug**
I'm trying to use RDMA to transfer a remote CPU variable to a local variable living in CUDA memory. First of all, is that use case supported? More generally, are th…
-
### Steps to reproduce the issue
Hello,
My installation of `py-scipy` is failing with the following message: ` scipy/meson.build:134:7: ERROR: Dependency "mkl-dynamic-lp64-seq" not found, tried pk…
-
### What happened?
Hello,
I'm now trying to use Kepler on a machine with access to the counters, but it does not seem to be working. On my VMs, I can see it working with the estimations, but now that …
-
On a system with P100 GPUs, with the following Spack environment (where I explicitly set `cuda_arch=60` for some packages):
```yaml
spack:
config:
install_tree: /my-spack/spack
build…
-
### Describe the bug
Hi,
In the Librispeech recipe, I have tried training the Conformer ASR without CTC loss. So I set the ctc weight to 0 in the parameter file:
```
ctc_weight: 0.0
grad_accumu…
-
The HIP package seems to support only the AMD platform; on the NVIDIA platform, the HIP tools don't work correctly (because they are set up for AMD).
Before installing the hip package through spack, I made …