NVIDIA/nccl
Optimized primitives for collective multi-GPU communication
Other
3.27k stars
826 forks
issues
nvidia-peermem nv_get_p2p_free_callback:127 ERROR detected invalid context, skipping further processing
#1474
hmeScaler
closed
1 month ago
2
Why tree algorithms are specifically targeted at All-Reduce?
#1473
jxh314
opened
1 month ago
1
ncclCommSplit in non-blocking API mode
#1472
kwen2501
opened
1 month ago
6
mpirun nccl capture fails in nccl2.22.3 but succeeds in nccl2.19.3
#1471
freshduer
opened
1 month ago
0
NCCL Collective Log Query
#1470
gjit-juniper
opened
1 month ago
2
Unstable busbw for data size 67108864B in nccl test
#1469
szhengac
opened
1 month ago
0
Topology printing bug in `ncclTopoPrintGraph` in NCCL 2.23?
#1468
ezhang887
opened
1 month ago
4
Enroot
#1467
dobiup
opened
1 month ago
1
[BUG]: NCCL_SHM_DISABLE flag is not working
#1466
priyanshu891
opened
1 month ago
1
nccl-test An error occurs when multiple IB_Hcas are used
#1465
yalbaba
opened
1 month ago
0
NCCL GPU affinity (nvidia-smi topo -m) on VM with fail PIX,PHB, PXB GDRDMA. but good performance on BM.
#1464
dobiup
opened
1 month ago
4
Why are not all SMs active when NCCL kernel and compute kernel overlap?
#1463
yu-depend
opened
2 months ago
0
Read abortFlag on the first spin in checkAbort
#1462
igozali
closed
2 months ago
3
NCCL Topology File, Rank Ordering, and Determinism
#1461
asjaffe
opened
2 months ago
0
[SHARP] Error When Running nccl-tests with multi-GPUs per node using SHARP
#1460
nariaki3551
opened
2 months ago
0
Can we change the number of proxy threads
#1459
ZhiyiHu1999
opened
2 months ago
0
How to estimate the communication time of NCCL alltoallv?
#1458
TarzanZhao
opened
2 months ago
0
Some questions about fifo buffer design
#1457
fanpig123
opened
2 months ago
0
[Question] Why ncclSend is non-blocking?
#1456
YanjieGao
opened
2 months ago
2
[SHArP] about the intranode allreduce performance with SHArP
#1455
shh2000
opened
2 months ago
0
300node 8GPU 4 IB NCCL TEST
#1454
gim4moon
opened
2 months ago
4
Poor NCCL allreduce performance
#1453
twichell
opened
2 months ago
4
Allow default alignment less than 16.
#1452
PatriosTheGreat
opened
2 months ago
0
Fix clang missing braces warning
#1451
PatriosTheGreat
opened
2 months ago
0
A Question about network buffer
#1450
ZhiyiHu1999
closed
4 weeks ago
3
Unable to Specify CUDA Stream for Collective Operations Using with torch.cuda.stream() context
#1449
nariaki3551
closed
2 months ago
0
Build failure on nccl 2.23.4. Missing shmutils.h
#1448
mhuguesaws
closed
2 months ago
8
NCCL socket performance over multiple NICs
#1447
iojw
opened
2 months ago
2
[ext-net] is bundling headers still recommended?
#1446
aws-nslick
opened
2 months ago
1
cuda memcpy instead of gpu kernel in p2p sendrecv operation
#1445
vvmex
opened
2 months ago
11
How can I see the algorithm chosen by NCCL?
#1444
Eevan-zq
opened
2 months ago
2
【the difference between NCCL and cudaMemcpyPeerAsync】
#1443
SunNy820828449
opened
2 months ago
0
NVLink SHARP Performance on AWS P5
#1442
Zha0q1
opened
2 months ago
0
Is there any benchmark of P2P communication between NCCL and UCX(ucp)?
#1441
MoFHeka
opened
2 months ago
2
Why NCCL P2P(send/recv) operators need a datatype parameters?
#1440
MoFHeka
closed
2 months ago
3
ALLREDUCE timeout
#1439
THEWEAKEST
opened
2 months ago
10
ALLREDUCE timeout
#1438
THEWEAKEST
closed
2 months ago
0
Allreduce timeout
#1437
THEWEAKEST
closed
2 months ago
0
CatArrayBatchedCopy can't overlap with AllGather
#1436
JuiceLemonLemon
opened
2 months ago
2
Bandwidth is different for GPU 0,1 and GPU 6,7
#1435
JuiceLemonLemon
closed
2 months ago
3
it supports fast failure when RDMA write fails,
#1434
alpha-baby
opened
2 months ago
1
How to handle comp-comm overlapping?
#1433
chenhongyu2048
opened
2 months ago
6
Why are not all SMs active when NCCL kernel and compute kernel overlap?
#1432
yu-depend
opened
2 months ago
0
Can I use nccl comm kernel as a persistent kernel?
#1431
chenhongyu2048
closed
2 months ago
0
Potential Issue with blockIdx.x and channelcount in common.h
#1430
Qizhi697
opened
2 months ago
1
Some question about NVLS and MNNVL.
#1429
shanleo2024
opened
2 months ago
4
Why the kernel duration time was affected by NCCL kernel?
#1428
JuiceLemonLemon
closed
2 months ago
4
net_ib: return ncclSuccess if read roceTypePath failed and errno is E…
#1427
limu713
opened
2 months ago
3
[Question] Why are thread affinities not set for nccl proxy threads?
#1426
joerowell
closed
2 months ago
3
[SHARP] Aggregation Manager Fails: Local Port validation failed
#1425
nariaki3551
closed
2 months ago
1