-
### Your current environment
The output of `python collect_env.py`
```text
Your output of `python collect_env.py` here
```
vllm 0.5.4
### 🐛 Describe the bug
Currently running inference on 8 * A800 GPUs, vl…
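For context, a minimal sketch of what 8-GPU tensor-parallel inference with vLLM 0.5.4 usually looks like (the model path and prompt below are placeholders, not taken from this report):
```python
from vllm import LLM, SamplingParams

# Placeholder model path; the original report does not say which model was served.
llm = LLM(
    model="/path/to/model",
    tensor_parallel_size=8,  # shard the model across the 8 A800 GPUs
)

params = SamplingParams(temperature=0.0, max_tokens=64)
outputs = llm.generate(["Hello, world"], params)
print(outputs[0].outputs[0].text)
```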
-
Hi,
I was wondering if it makes sense to set NCCL_ALGO=Tree while performing the all2all test?
Thanks,
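For what it's worth, one way to experiment with this is to export `NCCL_ALGO=Tree` before launching the alltoall benchmark from nccl-tests and let `NCCL_DEBUG=INFO` report which algorithm NCCL actually selects (whether Tree is honored for alltoall is exactly the open question here). A minimal sketch, assuming a standard nccl-tests build on a single 8-GPU node:
```python
import os
import subprocess

env = os.environ.copy()
env["NCCL_ALGO"] = "Tree"    # request the Tree algorithm
env["NCCL_DEBUG"] = "INFO"   # log what NCCL actually picks at runtime

# Assumed path to the alltoall benchmark built from https://github.com/nvidia/nccl-tests
subprocess.run(
    ["./build/alltoall_perf", "-b", "8", "-e", "128M", "-f", "2", "-g", "8"],
    env=env,
    check=True,
)
```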
-
### Description
Currently the `TorchTensorType(transport="nccl")` hint only changes behavior for compiled graphs. Non-compiled graphs cannot execute p2p or allreduce NCCL operations. The performance …
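For readers unfamiliar with the hint, a rough sketch of how it is typically attached in a compiled graph is shown below; the import paths and method names (`with_type_hint`, `experimental_compile`) vary across Ray versions, so treat this as an assumption rather than a verified snippet:
```python
import ray
import torch
from ray.dag import InputNode
from ray.experimental.channel.torch_tensor_type import TorchTensorType


@ray.remote(num_gpus=1)
class Sender:
    def send(self, shape):
        return torch.ones(shape, device="cuda")  # CUDA tensor to ship over NCCL


@ray.remote(num_gpus=1)
class Receiver:
    def recv(self, t):
        return float(t.sum())


sender, receiver = Sender.remote(), Receiver.remote()

with InputNode() as inp:
    out = sender.send.bind(inp)
    # The hint only takes effect once the DAG is compiled, which is the limitation described above.
    out = out.with_type_hint(TorchTensorType(transport="nccl"))
    dag = receiver.recv.bind(out)

compiled = dag.experimental_compile()
print(ray.get(compiled.execute((4, 4))))
```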
-
We have a server with 8 H100 GPUs, CUDA version 12.6, and NCCL version 2.23.4.
When we run the NCCL tests as per the command provided in https://github.com/nvidia/nccl-tests, we are facing belo…
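Since the excerpt cuts off before the actual error, one basic sanity check is to confirm which NCCL library is actually being loaded and that its version matches the expected 2.23.4; a minimal sketch, assuming `libnccl.so.2` is on the loader path:
```python
import ctypes

# Query the NCCL version directly from the shared library the system loads.
lib = ctypes.CDLL("libnccl.so.2")
ver = ctypes.c_int()
lib.ncclGetVersion(ctypes.byref(ver))

# For NCCL >= 2.9 the version is encoded as major*10000 + minor*100 + patch.
v = ver.value
print(f"NCCL {v // 10000}.{(v % 10000) // 100}.{v % 100}")
```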
-
Hi,
We are trying to run 4 VMs on a host with 8 H100s, each VM with 2 GPUs.
We found that the NVSwitches can only be passed through to a single VM, and the rest of the VMs get none. In this case, VMs wi…
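In a setup like this (2 GPUs per VM, no NVSwitch passed through), a quick sanity check from inside each guest is whether the two GPUs can still reach each other over PCIe P2P; a minimal sketch using PyTorch, assuming it is installed in the VM:
```python
import torch

n = torch.cuda.device_count()
print(f"visible GPUs: {n}")
for i in range(n):
    for j in range(n):
        if i != j:
            ok = torch.cuda.can_device_access_peer(i, j)
            print(f"GPU {i} -> GPU {j}: peer access {'available' if ok else 'NOT available'}")
```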
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch version: 2.6.0.dev20240930+cu124
Is debug build: False
CUDA used to b…
-
The test case `17. SM-modelparallelv2` uses custom PyTorch binaries `pytorch="2.2.0=sm_py3.10_cuda12.1_cudnn8.9.5_nccl_pt_2.2_tsm_2.3_cuda12.1_0`, which declare a dependency on `aws-ofi-nccl >=1.7.1,…
-
https://github.com/NVIDIA/nccl/issues/688
-
### Describe your problem
Hi,
I have just bought a new computer with 4 GPUs, and the VRAM is large enough to run some very large LLMs locally, like Mistral Large. I'm running a backend server with LM St…
-
### Your current environment
The output of `python collect_env.py`
```text
Your output of `python collect_env.py` here
```
### Model Input Dumps
model = LLM("DeepSeek-Coder-V2-Lite-Bas…