bytedance/flux
A fast communication-overlapping library for tensor parallelism on GPUs.
Apache License 2.0
223
stars
17
forks
issues
[ENHANCEMENT] Hello, is there a plan to support FP8?
#48
zkyue
opened
3 days ago
2
[QUESTION] Why is GemmRS result on hopper nondeterministic?
#47
umiswing
opened
2 weeks ago
1
[QUESTION] Not supported on A6000?
#46
Zhuohao-Li
opened
3 weeks ago
3
Update __init__.py
#45
zheng-ningxin
closed
1 month ago
0
[BUG] incorrect shape output from AGKernel.gather()
#44
152334H
opened
1 month ago
1
[QUESTION] Can flux run on RTX 4090?
#43
qinghon
opened
1 month ago
1
[QUESTION] Some questions about allgather+gemm
#42
ChrisRanger
opened
1 month ago
1
Update __init__.py
#41
zheng-ningxin
closed
2 months ago
0
[BUG] Illegal memory with multi-node
#40
YJHMITWEB
opened
2 months ago
1
[QUESTION] How does flux handle hardware resource competition?
#39
chenhongyu2048
opened
2 months ago
3
[QUESTION] Can Gemm_V3 be used in SM80?
#38
ginowu
closed
4 weeks ago
6
[QUESTION] Why is ring mode fixed to `All2All` in `src/all_gather/ths_op/all_gather_types.h`?
#37
lucifer1004
closed
2 months ago
2
[QUESTION] The GEMM time on GPUs of different ranks under tp8 varies widely and causes low performance
#36
Rainlin007
opened
2 months ago
8
[ENHANCEMENT] support for gpu A40
#35
1926627357
closed
2 months ago
3
[QUESTION] Why is flux gemm_rs not faster than torch?
#34
hxdtest
opened
2 months ago
5
[QUESTION] How to use nvshmem?
#33
chenhongyu2048
closed
2 months ago
8
[BUG] `no_nvlink` branch failed to compile
#32
lucifer1004
closed
2 months ago
6
[QUESTION] Is there a plan to support int8?
#31
Rainlin007
closed
2 months ago
1
[BUG] Failing to install byte-flux from pypi
#30
tlrmchlsmth
opened
3 months ago
8
[QUESTION] How to run examples in pynvshmem
#29
TonyWu199
closed
3 months ago
1
[BUG] Can't find nccl when building from source
#28
KnowingNothing
opened
3 months ago
5
[QUESTION] Are you planning on supporting FP8?
#27
MustafaFayez
closed
2 months ago
3
Update README.md
#26
zheng-ningxin
closed
3 months ago
0
Support pip installation
#25
zheng-ningxin
closed
3 months ago
0
add torch version to the whl name
#24
zheng-ningxin
closed
3 months ago
0
Support performance tuning for gemm-rs kernel on sm80
#23
zheng-ningxin
closed
4 months ago
0
Remove pynvshmem import in gemm_rs_80.py
#22
tlrmchlsmth
closed
4 months ago
0
Tune the AG performance for the llama-8b
#21
zheng-ningxin
closed
4 months ago
0
Are there any difficulties in implementing gemm-allreduce?
#20
Rainlin007
opened
4 months ago
2
feat: fix tuning for the all-gather gemm && move the reset-signal() to the forward critical path
#19
zheng-ningxin
closed
4 months ago
0
zero out all the allocated shm buffer
#18
zheng-ningxin
closed
4 months ago
0
[BUG] Incorrect results from flux.AGKernel for some problem shapes
#17
tlrmchlsmth
closed
4 months ago
15
Update README.md
#16
zheng-ningxin
closed
4 months ago
0
Add more device types for the time estimation.
#15
zheng-ningxin
closed
4 months ago
0
[BUG] Exception: not supported device NVIDIA H100 80GB HBM3
#14
wenscarl
closed
4 months ago
2
using c10::intrusive_ptr<c10d::ProcessGroup> as argument from python
#13
houqi
closed
4 months ago
1
fix the _allgather_base backend issue (issue11)
#12
zheng-ningxin
closed
4 months ago
0
[BUG] RuntimeError: Could not retrieve or create the backend 2 for device type cuda
#11
tlrmchlsmth
closed
4 months ago
14
[BUG] Illegal memory access when fuse_reduction=False
#10
tlrmchlsmth
closed
3 months ago
5
Support IPC && SM90 version of AG-GEMM, GEMM-RS
#9
zheng-ningxin
closed
4 months ago
0
[BUG] Illegal memory access in GemmRS when passing fuse_reduction=True and dtype=bfloat16
#8
tlrmchlsmth
closed
4 months ago
4
[BUG] gemm and reduce-scatter are not overlapped
#7
wenscarl
closed
4 months ago
9
Update BibTex
#6
wenlei-bao
closed
5 months ago
0
Add arXiv paper link
#5
wenlei-bao
closed
5 months ago
0
Reorganize and deduplicate files
#4
wenlei-bao
closed
5 months ago
0
All gather and reduce scatter on SM80
#3
zheng-ningxin
closed
5 months ago
0
add cutlass submodule and patches
#2
liwenchangbdbz
closed
8 months ago
0
add issue template
#1
liwenchangbdbz
closed
8 months ago
0