-
**What would you like to be added**:
A `limits` field could be added to the `ResourceGroup` struct for `ClusterQueue` or the new `Cohort` CRD.
Like this:
```yaml
apiVersion: kueue…
-
**Describe the bug**
1. GPU device **not** found in running RapidsAI Docker container in WSL
2. `nvidia-smi` **can** see the device
2.1. in Windows
2.2. in WSL
2.3. from within the runnin…
-
Running a [`3.3.5-3.4.0` exporter](https://github.com/NVIDIA/dcgm-exporter/releases/tag/3.3.5-3.4.0) against a 3.3.5 host engine, as shipped via nvidia-ubuntu-repos, segfaults the host engine.
Is there s…
-
### Check for existing issues
- [X] Completed
### Describe the bug / provide steps to reproduce it
Put simply, Zed doesn't respect the FlatConfig when formatting Vue files.
Weirdly enough, it work…
-
### System Info
trl, transformers: most recent on GitHub
Python 3.10.11
Ubuntu 22
package versions:
```
accelerate==1.0.1
addict==2.4.0
aiohappyeyeballs==2.4.3
aiohttp==3.10.10
aiosignal…
-
### NVIDIA Open GPU Kernel Modules Version
525.85.05
### Does this happen with the proprietary driver (of the same version) as well?
Yes
### Operating System and Version
Linux Mint 21.1…
-
**Not related to local or staging**
Same issue as in #1161
In short, we have
```
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA …
-
We have two H200 servers connected through an IP switch. We ran nccl_test, and the `all_reduce_perf` script worked well and delivered the expected performance on the bare-metal system.
```
fs@fs-207:~$ mpirun -np 16 -H 20…
-
### Your current environment
The output of `python collect_env.py`
```text
Your output of `python collect_env.py` here
```
Defaulted container "kserve-container" out of: kserve-container,…
-
### What is the issue?
It seems that `OLLAMA_MAX_QUEUE` is not taking effect. My environment is Windows 11, and I have set `OLLAMA_NUM_PARALLEL=1` and
`OLLAMA_MAX_QUEUE=1`, but excessive requests are sti…