issues
NVIDIA/nccl-tests
NCCL Tests
BSD 3-Clause "New" or "Revised" License
749 stars, 218 forks
NCCL Tree allreduce test cannot reach the theoretical bus bandwidth on 2 nodes with 4 nics
#232
ProHuper
closed
16 hours ago
0
Test NCCL failure common.cu:997 'internal error
#231
sdonoso
closed
1 week ago
9
what is cu:990 error? how to solve this problem?
#230
MAKER-park
opened
2 weeks ago
5
2 Nodes nccl-test with mpi hangs
#229
sdonoso
closed
2 weeks ago
1
has nvswitch, but uses 0 nvls channels
#228
MiyazonoKaori
closed
2 weeks ago
3
Test fail caused by ibvwrap.c:160 NCCL WARN Call to ibv_modify_qp failed with error Connection timed out.
#227
thsmfe001
closed
3 weeks ago
2
improve parsing of stepbytes (increment size) argument
#226
StefanoSalsano
opened
4 weeks ago
0
stepbytes (increment size) argument does not support 1M notation
#225
StefanoSalsano
opened
4 weeks ago
1
alltoall_perf: each rank is only sending to half of the other ranks
#224
russilwvong
closed
3 weeks ago
14
mpirun all_reduce_perf hang with multi-device test
#223
913871734
opened
4 weeks ago
0
NCCL WARN Cannot use cuda/gdr transports as part of specified UCX_TLS
#222
liuxingbo12138
opened
1 month ago
5
how to support One Device per Process?
#221
jiangxiaobin96
closed
1 week ago
4
1 GiB headroom might be too small
#220
Namnamseo
opened
1 month ago
0
Test NCCL failure common.cu:959 'internal error - please report this issue to the NCCL developers / '
#219
Assassin187
opened
1 month ago
9
Rank Assignment Issue under four containers on two different servers.
#218
thsmfe001
closed
1 month ago
8
all_reduce_perf hangs; using a single GPU on a 4GPU machine
#217
isaacgerg
closed
1 month ago
18
NCCL initialization hangs with 4 GPUs, but works with 2 GPUs
#216
mickaelseznec
opened
1 month ago
4
NCCL_ALGO on multi-node and multi-GPU
#215
MajidSalimi
opened
1 month ago
1
SendRecv Time
#214
osayamenja
opened
2 months ago
2
Nccl test seems run seperately on multi nodes
#213
jianh619
closed
2 months ago
6
H100 all reduce performance is poor
#212
liminn
opened
2 months ago
13
undefined reference nccl*
#211
gongyguo
closed
2 months ago
1
Differences problems in performance data of HGX A800 single server N GPUs nccl testing
#210
cloveryyg
opened
2 months ago
0
The network bandwidth in the alltoall_perf test failed to meet expectations
#209
fj1425fj
opened
2 months ago
4
Test NCCL failure common.cu:954 'unhandled cuda error
#208
YingYellow
closed
2 months ago
1
make failed, error -- unsupported GNU version! gcc versions later than 11 are not supported!
#207
jxh314
closed
2 months ago
0
misc/ibvwrap.cc:278 NCCL WARN Call to ibv_reg_mr_iova2 failed with error Cannot allocate memory
#206
jxh314
closed
2 months ago
2
cputime
#205
tks2004
opened
2 months ago
0
Test NCCL failure common.cu:961 'internal error - please report this issue to the NCCL developers / '
#204
a-c-dream
opened
3 months ago
7
Add bisection test
#203
x41lakazam
opened
3 months ago
3
Why getBw don't have access to agg_iters ?
#202
x41lakazam
closed
3 months ago
1
Performance lack of NCCL Test
#201
shengode503
opened
4 months ago
5
Multi node test hang phenomenon
#200
gim4moon
closed
4 months ago
2
Interaction between NCCL_IB_SL and NCCL_IB_ADAPTIVE_ROUTING
#199
DanieleDeSensi
opened
4 months ago
0
How is the maximum number of bytes for all_reduce operation calculated?
#198
jxh314
closed
2 months ago
3
How to explain Bus Bandwidth in Allreduce Operation?
#197
HydraQYH
opened
4 months ago
0
busbw exceeds network bandwidth (2 nodes, 16 gpus, 100Gbps intel NIC, no NVSwitch) - what algorithm is used?
#196
ofilip
closed
5 months ago
5
undefined reference to ncclRedOpDestroy
#195
freshduer
opened
5 months ago
2
all_reduce_perf between NVLINK connected H100 PCIe GPUs lower than A100 SXM4 GPUs
#194
chinthysl
opened
5 months ago
0
NCCL Test hang when the number of nodes goes beyond 18, and CPU usage is very high
#193
chgdragon2023
opened
5 months ago
2
NCCL Test Does not work with GID 3 or GID 1, but it works fine for GID 0
#192
chgdragon2023
opened
6 months ago
0
nccl-tests result is only a half of ib_write_bw
#191
HeGaoYuan
opened
6 months ago
0
hypercube out-of-bound errors with single-proc + `gpus-per-thread=4`, not with multi-proc + `gpus-per-thread=1`
#190
robogast
opened
6 months ago
1
clarify that the measurement is unidirectional
#189
stas00
opened
6 months ago
11
misc/socket.cc:441 NCCL WARN socketFinalizeAccept: wrong type 4 != 3
#188
MiyazonoKaori
closed
6 months ago
6
NCCL alltoall_perf hangs via PXN
#187
gavin1332
closed
6 months ago
1
how can i run nccl-test use max bandwidth
#186
liuxingbo12138
opened
7 months ago
0
misc/ibvwrap.cc:187 NCCL WARN Call to ibv_modify_qp failed with error Network is unreachable
#185
chgdragon2023
opened
7 months ago
3
Nsight Profiling: one ncclAllReduce takes too long
#184
yanminjia
opened
7 months ago
0
Test NCCL failure common.cu:954 'unhandled cuda error" when test on >2 GPUs
#183
caopulan
closed
7 months ago
4