-
### Your current environment
The output of `python collect_env.py`
```text
Your output of `python collect_env.py` here
```
vllm 0.5.4
### 🐛 Describe the bug
Currently running inference on 8 * A800 GPUs, vl…
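For context, a minimal sketch of what 8-GPU tensor-parallel inference with vLLM 0.5.4 usually looks like (the model path and prompt below are placeholders, not taken from this report):
```python
from vllm import LLM, SamplingParams

# Placeholder model path; the original report does not say which model was served.
llm = LLM(
    model="/path/to/model",
    tensor_parallel_size=8,  # shard the model across the 8 A800 GPUs
)

params = SamplingParams(temperature=0.0, max_tokens=64)
outputs = llm.generate(["Hello, world"], params)
print(outputs[0].outputs[0].text)
```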
-
Hi,
I was wondering if it makes sense to set NCCL_ALGO=Tree while performing the all2all test?
Thanks,
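For what it's worth, one way to experiment with this is to export `NCCL_ALGO=Tree` before launching the alltoall benchmark from nccl-tests and let `NCCL_DEBUG=INFO` report which algorithm NCCL actually selects (whether Tree is honored for alltoall is exactly the open question here). A minimal sketch, assuming a standard nccl-tests build on a single 8-GPU node:
```python
import os
import subprocess

env = os.environ.copy()
env["NCCL_ALGO"] = "Tree"    # request the Tree algorithm
env["NCCL_DEBUG"] = "INFO"   # log what NCCL actually picks at runtime

# Assumed path to the alltoall benchmark built from https://github.com/nvidia/nccl-tests
subprocess.run(
    ["./build/alltoall_perf", "-b", "8", "-e", "128M", "-f", "2", "-g", "8"],
    env=env,
    check=True,
)
```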
-
### Description
Currently the `TorchTensorType(transport="nccl")` hint only changes behavior for compiled graphs. Non-compiled graphs cannot execute p2p or allreduce NCCL operations. The performance …
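For readers unfamiliar with the hint, a rough sketch of how it is typically attached in a compiled graph is shown below; the import paths and method names (`with_type_hint`, `experimental_compile`) vary across Ray versions, so treat this as an assumption rather than a verified snippet:
```python
import ray
import torch
from ray.dag import InputNode
from ray.experimental.channel.torch_tensor_type import TorchTensorType


@ray.remote(num_gpus=1)
class Sender:
    def send(self, shape):
        return torch.ones(shape, device="cuda")  # CUDA tensor to ship over NCCL


@ray.remote(num_gpus=1)
class Receiver:
    def recv(self, t):
        return float(t.sum())


sender, receiver = Sender.remote(), Receiver.remote()

with InputNode() as inp:
    out = sender.send.bind(inp)
    # The hint only takes effect once the DAG is compiled, which is the limitation described above.
    out = out.with_type_hint(TorchTensorType(transport="nccl"))
    dag = receiver.recv.bind(out)

compiled = dag.experimental_compile()
print(ray.get(compiled.execute((4, 4))))
```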
-
We have a server with 8 H100 GPUs, CUDA version 12.6, and NCCL version 2.23.4.
When we run the NCCL tests as per the command provided in https://github.com/nvidia/nccl-tests, we are facing belo…
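Since the excerpt cuts off before the actual error, one basic sanity check is to confirm which NCCL library is actually being loaded and that its version matches the expected 2.23.4; a minimal sketch, assuming `libnccl.so.2` is on the loader path:
```python
import ctypes

# Query the NCCL version directly from the shared library the system loads.
lib = ctypes.CDLL("libnccl.so.2")
ver = ctypes.c_int()
lib.ncclGetVersion(ctypes.byref(ver))

# For NCCL >= 2.9 the version is encoded as major*10000 + minor*100 + patch.
v = ver.value
print(f"NCCL {v // 10000}.{(v % 10000) // 100}.{v % 100}")
```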
-
Hi,
We are trying to run 4 VMs on a host with 8 H100s, each VM with 2 GPUs.
We found that the NVSwitches can only be passed through to a single VM, and the rest of the VMs get none. In this case, VMs wi…
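In a setup like this (2 GPUs per VM, no NVSwitch passed through), a quick sanity check from inside each guest is whether the two GPUs can still reach each other over PCIe P2P; a minimal sketch using PyTorch, assuming it is installed in the VM:
```python
import torch

n = torch.cuda.device_count()
print(f"visible GPUs: {n}")
for i in range(n):
    for j in range(n):
        if i != j:
            ok = torch.cuda.can_device_access_peer(i, j)
            print(f"GPU {i} -> GPU {j}: peer access {'available' if ok else 'NOT available'}")
```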
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch version: 2.6.0.dev20240930+cu124
Is debug build: False
CUDA used to b…
-
The test case `17. SM-modelparallelv2` uses custom PyTorch binaries `pytorch="2.2.0=sm_py3.10_cuda12.1_cudnn8.9.5_nccl_pt_2.2_tsm_2.3_cuda12.1_0`, which declare a dependency on `aws-ofi-nccl >=1.7.1,…
-
https://github.com/NVIDIA/nccl/issues/688
-
### Describe your problem
Hi,
I have just bought a new computer with 4 GPUs, and the VRAM is large enough to run some very large LLMs locally, like Mistral Large. I'm running a backend server with LM St…
-
### Your current environment
The output of `python collect_env.py`
```text
Your output of `python collect_env.py` here
```
### Model Input Dumps
model = LLM("DeepSeek-Coder-V2-Lite-Bas…