-
I tried to run the test_nccl.py code in [my own repo](https://github.com/MonicaGu/NCCLCommunication). This repo contains code that provides a Python API over NCCL for sending and receiving PyTorch tensors. I …
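For reference, the same send/receive pattern can be sketched with PyTorch's built-in `torch.distributed` API over the NCCL backend (a minimal sketch, not the repo's own API; the rank layout, port, and tensor shape are illustrative):

```python
import os

import torch
import torch.distributed as dist


def send_recv_demo(rank: int, world_size: int) -> None:
    """Rank 0 sends a CUDA tensor to rank 1 over the NCCL backend."""
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")  # illustrative rendezvous port
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)
    if rank == 0:
        t = torch.arange(4, dtype=torch.float32, device="cuda")
        dist.send(t, dst=1)   # blocking point-to-point send
    elif rank == 1:
        t = torch.empty(4, dtype=torch.float32, device="cuda")
        dist.recv(t, src=0)   # blocking point-to-point receive
    dist.destroy_process_group()


if __name__ == "__main__":
    # Requires at least 2 GPUs; spawn one process per rank.
    torch.multiprocessing.spawn(send_recv_demo, args=(2,), nprocs=2)
```

NCCL has supported point-to-point send/recv since PyTorch 1.8, so a thin wrapper like the repo's can also be expressed this way.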
-
The Nvidia HPCG benchmark used to be available only via containers, but it was open-sourced last week and is available at https://github.com/NVIDIA/nvidia-hpcg. It works on both CPU (x86-64 and aarc…
-
Hello,
I would like to ask a PyTorch question. I'm on Ubuntu 22.04 with an AMD GPU and ZLUDA, and I found that the PyTorch build did not use zluda/target/release. So how does ZLUDA work? These expor…
-
Hi folks,
I was trying to understand why NCCL doesn't negotiate ports, i.e., why it isn't NAT-transparent.
Say one instance runs inside Docker with a port translated to 54321, i.e. -p 54321:54321.
On another host…
-
Hi, we recently observed that when running with NCCL_ALGO=Tree,NCCL_PROTO=Simple, NCCL falls back to Ring,LL for broadcast. It seems that NCCL_PROTO is ignored when no valid ALGO/PROTO pair is found fo…
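For what it's worth, this is how the settings in question can be applied (a plain sketch; the values are the ones from the report, and NCCL_DEBUG=INFO is only there so the logs show which algorithm/protocol NCCL actually selects):

```python
import os

# NCCL reads these at communicator creation, so they must be set
# before the first init_process_group / ncclCommInitRank call.
os.environ["NCCL_ALGO"] = "Tree"      # requested algorithm
os.environ["NCCL_PROTO"] = "Simple"   # requested protocol
os.environ["NCCL_DEBUG"] = "INFO"     # log the algo/proto NCCL actually picks

print(os.environ["NCCL_ALGO"], os.environ["NCCL_PROTO"])
```

With NCCL_DEBUG=INFO, the per-collective log lines make the fallback to Ring,LL visible directly.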
-
![fig](https://github.com/user-attachments/assets/80398e7f-975b-4de1-9c9b-ff85633a5d77)
In code/overall/LLM_deepspeed.yaml, train_batch_size and eval_batch_size are both set to 1.
NCCL error on a single GPU, do…
-
When I use the DINO config to test with PyTorch 1.13 + mmcv 2.0.0, I get this error:
-
I'd like to know roughly how much GPU memory this fine-tuning needs. I used six 4090s but still ran out of memory; could you help me look into the problem? My run script looks like this:
data_path='./data_files'
model_name_or_path='/data2/hugo/lin_rany/model/Meta-Llama-3-8B-Instruct'
export NCCL_P2P_DISABLE=1
export NCCL_IB_…
-
**Describe the bug**
Running the Pythia-7B fine-tune script on 4 x A10 (24GB each).
Seems like an issue with the sequence length:
```
Token indices sequence length is longer than the specified maximum seque…
```
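The usual fix for that warning is to truncate or window the encoded ids down to the model's maximum length. A minimal, tokenizer-agnostic sketch (with a Hugging Face tokenizer you would normally pass `truncation=True` and `max_length=` instead; `chunk_token_ids` and the numbers here are illustrative):

```python
def chunk_token_ids(ids, max_length, stride=0):
    """Split an over-long token-id sequence into windows of at most
    `max_length` tokens, overlapping consecutive windows by `stride`."""
    if max_length <= stride:
        raise ValueError("max_length must exceed stride")
    step = max_length - stride
    return [ids[i : i + max_length] for i in range(0, len(ids), step)]


# A 10-token sequence against a 4-token limit:
print(chunk_token_ids(list(range(10)), max_length=4))
# → [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Feeding each chunk through the model separately avoids the "longer than the specified maximum" warning without silently dropping tokens.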
-
### Description
![image](https://github.com/user-attachments/assets/aec7915a-176e-4290-a002-d6e048bcff9a)
A, B, C, and D are different actors, and all data among these four actors is transferr…