-
### Your current environment
Docker image: vllm/vllm-openai:latest
### 🐛 Describe the bug
`docker run --runtime nvidia --gpus all -v ~/.cache/huggingface:/root/.cache/huggingface -v /mnt/ddata…
-
The CI workflows for DifferentiationInterface are getting bigger and bigger, due to the need to test with a lot of upstream and downstream dependencies.
Since we parallelize the tests to avoid compat…
-
I always find that many things are missing from the requirements. In this case, it also uses NCCL for multiple GPUs, which is not yet available on Windows (I did not try WSL due to space issue…
-
I am trying to obtain per-process GPU metrics using DCGM-exporter.
Logs from nv-hostengine:
```
root@dcgm-exporter-tlb4f:/# 2021-11-23 00:15:28.951 ERROR [82:82] Cannot initialize the hostengine: Error: …
```
-
**Is your feature request related to a problem? Please describe.**
I'm always frustrated when I have to open another window to monitor the usage of my GPUs.
**Describe the solution you'd like**
I…
-
_The template below is mostly useful for bug reports and support questions. Feel free to remove anything which doesn't apply to you and add more information where it makes sense._
### 1. Issue or f…
-
I'm wondering if there is a simple way to set up a configuration for dedicating a single GPU on a multi-GPU system to time-slicing. For example, my use case is that I have some services which are crit…
-
### 🐛 Describe the bug
The DDP init call fails when using a subclass of `torch.Tensor`; the same code works with a plain `torch.Tensor`.
Command to run the code:
python test.py --max-gpus 2 --batch-size 512 --epoch …
-
Hi @TaoZhong11 ,
Thanks for this amazing work!
I encountered an error while running the Docker image with data I obtained from https://fcon_1000.projects.nitrc.org/indi/PRIMEdownloads.html. To t…
-
NVSHMEM is an implementation of OpenSHMEM for NVIDIA GPUs:
https://developer.nvidia.com/nvshmem
https://docs.nvidia.com/hpc-sdk/nvshmem/api/docs/index.html
It is essentially an alternative to M…