-
Hello,
Is it possible to install monai-label on Kubernetes?
I'm trying to make it work via a classic deployment but I'm encountering the following problems in my pods:
[k8sgpu-01|monai-dev] im…
ism93 updated
2 months ago
-
### What happened?
Unable to collect GPU metrics for relevant pods when using passthrough mode.
For example, dcgm-exporter does not collect metrics when a VM created with kubevirt mounts a GPU in …
-
Original issue https://gitlab.freedesktop.org/hadess/switcheroo-control/-/issues/39
---
To allow consumers to make a more educated decision about which GPU to choose, switcheroo-control could expo…
-
**Describe the bug**
When a k8s-manager does not have a GPU, Omnia will not deploy the `k8s-device-plugin`. We need to inspect the entire inventory for GPUs before deploying the plugin. I suggest we …
j0hnL updated
5 months ago
-
To run LLaMA 3.1 (or similar large language models) locally, you need specific hardware requirements, especially for storage and other resources. Here's a breakdown of what you typically need:
### …
-
**Describe the bug**
Running the text classification example's ag news training step on multiple discrete GPUs fails with "Shader validation error":
- [stacktrace.log](https://github.com/tracel-ai…
-
Hi,
I am trying to compare mscclpp and nccl on allreduce performance. I used the following script to get the performance metrics for mscclpp and nccl respectively.
For mscclpp:
```
log_dir="logs"
…
```
-
Following the install instructions:
PyTorch 2.0 + CUDA 11.8
```
conda create -n sparsebev python=3.8
conda activate sparsebev
conda install pytorch==2.0.0 torchvision==0.15.0 pytorch-cuda=11.8 -c…
```
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Is your issue described in the documentation?
- [X] I have read the documentation
### Is your issue p…
-
**Describe the bug**
Current docker images with tags "main", "1.0.1", and "1.0.0" crash when training.
RuntimeError: CUDA error: no kernel image is available for execution on the device
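For anyone triaging this kind of error: it usually means the binary was not built for the GPU's compute capability. Below is a minimal sketch of that check, not taken from the issue. The helper name `has_usable_kernel` is hypothetical, and the rule it encodes is the general CUDA compatibility rule (an `sm_XY` binary runs on devices with the same major version and an equal or higher minor version; `compute_XY` PTX can be JIT-compiled for any equal or newer capability).

```python
def has_usable_kernel(capability, arch_list):
    """Return True if any entry in arch_list can run on a device
    with the given (major, minor) compute capability.

    Hypothetical helper for illustration; arch_list entries are
    expected to look like "sm_80" or "compute_90".
    """
    major, minor = capability
    for arch in arch_list:
        kind, _, digits = arch.partition("_")
        # Skip entries this sketch does not understand (e.g. "sm_90a").
        if not digits.isdigit() or len(digits) < 2:
            continue
        a_major, a_minor = int(digits[:-1]), int(digits[-1])
        # Compiled binary code: same major version, minor not newer than the device.
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        # PTX: can be JIT-compiled for any device at least as new.
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


# Example: a build targeting only sm_70 and sm_75 has no kernel for an sm_86 GPU.
print(has_usable_kernel((8, 6), ["sm_70", "sm_75"]))
```

With PyTorch installed, the two inputs could come from `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()`; whether that is the cause for the images in this report is an assumption until checked on the affected machine.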
**To…