aws/aws-ofi-nccl
This is a plugin that lets EC2 developers use libfabric as a network provider while running NCCL applications.
Apache License 2.0
147 stars, 56 forks
issues
#615 Fix log format string behavior (bwbarrett, closed 2 months ago, 0 comments)
#614 rdma: add separate bounce buffer freelist for data (eager) messages (rauteric, opened 2 months ago, 9 comments)
#613 util: Use FI_ENOPROTOOPT to check for a provider's support for option (rajachan, closed 2 months ago, 0 comments)
#612 CI updates (rajachan, closed 2 months ago, 0 comments)
#611 "Request completed with error" log leads to p5e cluster collapse (vmarkovtsev, closed 1 month ago, 0 comments)
#610 Improve protocol selection logic (bwbarrett, closed 2 months ago, 0 comments)
#609 NCCL RDMA expects fi_cq_data_entry, but OPX provider fills CQ with fi_cq_tagged_entry (lsavers, closed 2 months ago, 2 comments)
#608 feat(ci/github): use docker instead of codebuild (aws-nslick, closed 2 months ago, 0 comments)
#607 fix(valgrind): fix autotools mistake (aws-nslick, closed 2 months ago, 0 comments)
#606 Initialization fails for OPX Libfabric Provider (lsavers, closed 2 months ago, 0 comments)
#605 fix(tree): import libfabric's container_of macro (aws-nslick, closed 2 months ago, 0 comments)
#604 Add Multiplexed-round-robin scheduler (arunkarthik-akkart, closed 1 month ago, 3 comments)
#603 platform: trn1 default protocol send receive (hunnorth, closed 2 months ago, 5 comments)
#602 Fix: access domain from ep during mr on device (maxtmann, closed 2 months ago, 1 comment)
#601 feat(build): disable semantic interposition (aws-nslick, closed 1 month ago, 2 comments)
#600 freelist: separate out metadata from user data (rauteric, opened 2 months ago, 4 comments)
#599 Seg Fault during RDMA NCCL Connection with OPX Provider (lsavers, closed 2 months ago, 4 comments)
#598 fix(sendrecv): fix a memory leak (aws-nslick, opened 2 months ago, 0 comments)
#597 No include folder after installation (YJHMITWEB, closed 2 months ago, 5 comments)
#596 feat(build): better --enable-debug defaults (aws-nslick, closed 2 months ago, 0 comments)
#595 fix(platform-aws): fill all platform values (aws-nslick, closed 1 month ago, 0 comments)
#594 fix(tree): use empty brace initializers for zero-initialization (aws-nslick, closed 1 month ago, 2 comments)
#593 fix(tracing/nvtx): silence -Wmissing-field-initializer warnings (aws-nslick, closed 1 month ago, 0 comments)
#592 feat(ci): add package generation (aws-nslick, closed 1 month ago, 0 comments)
#591 feat(rdma): constrain C linkage to init (aws-nslick, closed 1 month ago, 2 comments)
#590 fix(tracing): use header-only nvtx3 (aws-nslick, closed 2 months ago, 0 comments)
#589 fix(build): check features before mangling CFLAGS (aws-nslick, closed 1 month ago, 1 comment)
#588 feat(build): add -Wextra to "picky" compiler flags (aws-nslick, closed 1 month ago, 0 comments)
#587 fix(test): fix typing issues (aws-nslick, closed 1 month ago, 0 comments)
#586 fix(rdma): avoid enum/integral comparison (aws-nslick, closed 1 month ago, 0 comments)
#585 fix(tree): add fallthrough switch markers (aws-nslick, closed 1 month ago, 1 comment)
#584 register_mr_buffers:544 NCCL WARN NET/OFI Unable to register memory (type = 2) for device 0. RC: -22, Error: Invalid argument (visatish, opened 2 months ago, 9 comments)
#583 fix(tuner): don't choose NVLSTree if nRanks==nNodes (AmedeoSapio, closed 2 months ago, 1 comment)
#582 chore(.github/workflows): constrain push triggers to known branches (aws-nslick, closed 2 months ago, 2 comments)
#581 fix(cuda): avoid loading stub (aws-nslick, closed 1 month ago, 4 comments)
#580 .ci/aws: Stop Running ofi nccl functional tests until they are fixed (a-szegel, closed 2 months ago, 1 comment)
#579 .ci/aws: Pin p4/p5 ami's to AMI's from 8/7/24 (a-szegel, closed 2 months ago, 2 comments)
#578 chore(build): replace `-Wc++-compat' with `-x c++' (aws-nslick, closed 1 month ago, 0 comments)
#577 fix(neuron): remove const from ncclNetPlugin_v{4,5} syms (aws-nslick, closed 1 month ago, 0 comments)
#576 fix(sendrecv): add missing nccl-headers include (aws-nslick, closed 1 month ago, 0 comments)
#575 fix(tree): avoid sign comparison issues (aws-nslick, closed 1 month ago, 1 comment)
#574 fix(rdma): use COMM_ID_MASK as invalid id (aws-nslick, closed 1 month ago, 2 comments)
#573 fix(tuner): fix implicit conversions (aws-nslick, closed 1 month ago, 0 comments)
#572 fix(idpool): avoid sign comparison issues (aws-nslick, closed 2 months ago, 1 comment)
#571 fix(param): move some parameters to unsigned (aws-nslick, closed 1 month ago, 2 comments)
#570 feat(param): add uint parameter macro (aws-nslick, closed 1 month ago, 1 comment)
#569 fix(tuner): avoid gotos (aws-nslick, closed 1 month ago, 0 comments)
#568 feat(test): parse as c++ source (aws-nslick, closed 1 month ago, 0 comments)
#567 chore(build): mpi: set mpicxx, too. (aws-nslick, closed 1 month ago, 2 comments)
#566 chore(build): add AC_PROG_CXX (aws-nslick, closed 1 month ago, 1 comment)