microsoft/mscclpp
MSCCL++: A GPU-driven communication stack for scalable AI applications
MIT License
246 stars
38 forks
issues
Number | Title | Author | State | Time | Comments
#378 | Add kernel-based verification for executor_test | yzygitzh | opened | 7 hours ago | 0
#377 | [Bug] flush() hang bug. | TonyWu199 | closed | 3 days ago | 0
#376 | Improve CMake options | chhwang | opened | 6 days ago | 0
#375 | NVLS support for msccl++ executor | Binyang2014 | opened | 6 days ago | 0
#374 | [docs] fix quickstart link | jeffra | closed | 1 week ago | 0
#373 | Update README.md | adk9 | opened | 1 week ago | 0
#372 | Fix in-place all-gather input buffer in executor_test | yzygitzh | closed | 1 week ago | 0
#371 | Support Simple protocol plans using scratch buffer | yzygitzh | opened | 2 weeks ago | 0
#370 | Update docker image for cuda12.4 | Binyang2014 | closed | 2 weeks ago | 0
#369 | Fix algo repo name | Binyang2014 | closed | 2 weeks ago | 0
#368 | [Feature] Immediate data upon signal & wait | chhwang | opened | 2 weeks ago | 0
#367 | Fix copyright messages | chhwang | closed | 2 weeks ago | 0
#366 | signal/poll optimization | liangyuRain | opened | 2 weeks ago | 9
#365 | Executor AllGather In-Place Support | caiomcbr | closed | 2 weeks ago | 0
#364 | Perf optimization & support clipping | chhwang | closed | 2 weeks ago | 1
#363 | Fix NCCL API bugs | chhwang | closed | 2 weeks ago | 2
#362 | [Perf] Failed to reproduce the performance result for Single-node AllReduce mentioned in README.md | FC-Li | closed | 6 days ago | 5
#361 | [Feature] Can NVIDIA and AMD communicate? | liuyang6055 | closed | 2 weeks ago | 1
#360 | Allreduce performance optimization and correctness fix | nusislam | opened | 1 month ago | 1
#359 | [Bug] Can't launch allreduce test | chenhongyu2048 | closed | 1 month ago | 1
#358 | Update integration-test-rocm.yml for Azure Pipelines | Binyang2014 | closed | 1 month ago | 0
#357 | Update ROCm CI | chhwang | closed | 1 month ago | 1
#356 | Fix NPKit exit event offset | yzygitzh | closed | 1 month ago | 0
#355 | Use IB transport flags only when an IB device exists | chhwang | closed | 1 month ago | 0
#354 | [TEST PR] Update ib.cc | EricWangCN | closed | 1 month ago | 0
#353 | Fixing RegisterMemory Allocation for ProxyChannels | caiomcbr | closed | 1 month ago | 0
#352 | Fixing RegisterMemory Allocation for ProxyChannels | caiomcbr | closed | 1 month ago | 0
#351 | Add proxy channel related operations | Binyang2014 | closed | 1 month ago | 0
#350 | Is there exist some documentation to explain the difference between allreduce algorithm in mscclpp? | MARD1NO | closed | 1 month ago | 4
#349 | [Bug] libmscclpp_nccl fails linking using ROCm 6.0 | corey-derochie-amd | closed | 1 month ago | 2
#348 | [Doc] mscclpp docs | Binyang2014 | closed | 2 weeks ago | 0
#347 | Fix for ROCm 6.0 | chhwang | closed | 2 months ago | 0
#346 | Add CI for rocm | Binyang2014 | closed | 1 month ago | 0
#345 | Tune threads per block for mscclpp executor | Binyang2014 | closed | 1 month ago | 0
#344 | Support executors to send packets over ProxyChannel | caiomcbr | closed | 2 months ago | 0
#343 | Fix for ROCm 6.0 | chhwang | closed | 2 months ago | 0
#342 | ProxyChannel Support in Executor | caiomcbr | closed | 2 months ago | 0
#341 | Fix bug for construct sempaphore | Binyang2014 | closed | 2 months ago | 0
#340 | Make ibverbs optional at compile time | chhwang | closed | 2 months ago | 0
#339 | Removing Ibverbs Dependency | caiomcbr | closed | 2 months ago | 0
#338 | Auto-tune vector sizes for NVLS allreduce6 | roshandathathri | closed | 2 months ago | 0
#337 | Dynamically load libibverbs | caiomcbr | closed | 2 months ago | 0
#336 | bfloat16 support | chhwang | closed | 2 months ago | 0
#335 | [Bug] `mscclpp/concurrency_device.hpp: No such file or directory` | TZHelloWorld | closed | 3 months ago | 2
#334 | Fix missing import in executor test | yzygitzh | closed | 3 months ago | 0
#333 | Update quickstart.md | chhwang | closed | 3 months ago | 0
#332 | Add support for different vector sizes in multimem instructions | roshandathathri | closed | 3 months ago | 0
#331 | NCCL API Executor Integration | caiomcbr | closed | 3 months ago | 0
#330 | NCCL API Executor Integration | caiomcbr | closed | 3 months ago | 0
#329 | Executor integration for NCCL APIs | caiomcbr | closed | 3 months ago | 0