-
PPO training returns NaN when using multiple GPUs; forcing it to use one GPU works fine. I ran exactly the same training code as in [Brax Training](https://colab.research.google.com/github/googl…
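A common multi-device pitfall in JAX training loops like Brax's is updating per-device parameter replicas without synchronizing gradients, which lets the replicas diverge (and eventually produce NaN). The sketch below is not the Brax PPO code, just a minimal illustration of the `pmap`/`pmean` pattern; the two-device CPU emulation via `XLA_FLAGS` is a debugging convenience so it runs without GPUs.

```python
import os
# Emulate two XLA devices on the CPU so this sketch runs without GPUs
# (must be set before importing jax).
os.environ["XLA_FLAGS"] = "--xla_force_host_platform_device_count=2"

import functools
import jax
import jax.numpy as jnp

def loss_fn(w, x, y):
    # Simple least-squares loss, standing in for the real PPO objective.
    return jnp.mean((x @ w - y) ** 2)

@functools.partial(jax.pmap, axis_name="dev")
def update(w, x, y):
    grads = jax.grad(loss_fn)(w, x, y)
    # Average gradients across devices; omitting this sync is a frequent
    # cause of replicas drifting apart in multi-GPU runs.
    grads = jax.lax.pmean(grads, axis_name="dev")
    return w - 0.1 * grads

n_dev = jax.device_count()                    # 2 emulated devices here
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (n_dev, 8, 4))     # one data shard per device
y = jnp.ones((n_dev, 8))
w = jnp.zeros((n_dev, 4))                     # parameters replicated per device

for _ in range(10):
    w = update(w, x, y)
```

After training, the replicas in `w[0]` and `w[1]` should remain bitwise in sync; if they differ or contain NaN, gradient synchronization is the first thing to inspect.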
-
Hi, I am very interested in your work on PipeInfer!
However, the current implementation does not seem to support multiple GPUs. Are there any upcoming plans or suggestions for integrating support for…
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussion…
-
As described here: https://huggingface.co/docs/diffusers/en/training/distributed_inference#pytorch-distributed
Has it been tested? I'm wondering what the best way to do this is. Any suggestions / p…
-
I'm trying to replace the CPU index with a GPU one, but can't seem to do it in a distributed context.
Faiss version:
faiss 1.8.0 pypi_0 pypi
faiss-gpu …
-
### Your current environment
```text
(vllm) nd600@PC-7C610BFD7B:~$ python collect_env.py
Collecting environment information...
/home/nd600/miniconda3/envs/vllm/lib/python3.10/site-packages/torch…
-
I'm trying to compile this for an AMD 6900XT. On an AMD CPU, arch linux. Pytorch works properly with ROCM.
Here are some of the link issues I'm getting:
```
/usr/bin/ld: CMakeFiles/ctranslate2.d…
-
A single GPU is OK, but the system hangs when I use multiple GPUs. Can someone help solve this? Thanks.
python build.py --model_dir meta-llama/Llama-2-7b-chat-hf \
--dtype float16 \
…
-
How can I use multiple GPUs when doing a classification task?
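The question names no framework, so as one possible answer here is a minimal PyTorch sketch: wrapping a classifier in `nn.DataParallel` makes each forward pass split the batch across all visible GPUs. On a CPU-only machine the wrapper simply runs the module unchanged, so the sketch is runnable anywhere.

```python
import torch
import torch.nn as nn

# Hypothetical toy classifier: 32 features in, 10 classes out.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))

device = "cuda" if torch.cuda.is_available() else "cpu"
# DataParallel scatters each input batch across the available GPUs and
# gathers the outputs; with no GPUs it falls through to the plain module.
model = nn.DataParallel(model).to(device)

x = torch.randn(16, 32, device=device)
logits = model(x)             # shape (16, 10)
preds = logits.argmax(dim=1)  # predicted class per sample, shape (16,)
```

For multi-node jobs or best single-node throughput, `torch.nn.parallel.DistributedDataParallel` (one process per GPU) is generally preferred over `DataParallel`.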
-
Everything works with 1 GPU and `num_workers` > 0, but if the number of GPUs is set to > 1 I get this error:
```
RuntimeError: Expected all tensors to be on the same device, but found at least two d…