bytedance/flux issues

bytedance / flux

A fast communication-overlapping library for tensor parallelism on GPUs.

Apache License 2.0

223 stars 17 forks

issues

[ENHANCEMENT] Hello, is there a plan to support fp8?

#48 zkyue opened 3 days ago
2
[QUESTION] Why is GemmRS result on hopper nondeterministic?

#47 umiswing opened 2 weeks ago
1
[QUESTION] Not supported on A6000?

#46 Zhuohao-Li opened 3 weeks ago
3
Update __init__.py

#45 zheng-ningxin closed 1 month ago
0
[BUG] incorrect shape output from AGKernel.gather()

#44 152334H opened 1 month ago
1
[QUESTION] Can flux run on RTX 4090?

#43 qinghon opened 1 month ago
1
[QUESTION] Some questions about allgather+gemm

#42 ChrisRanger opened 1 month ago
1
Update __init__.py

#41 zheng-ningxin closed 2 months ago
0
[BUG] Illegal memory with multi-node

#40 YJHMITWEB opened 2 months ago
1
[QUESTION] How does flux handle hardware resource competition?

#39 chenhongyu2048 opened 2 months ago
3
[QUESTION] Can Gemm_V3 be used in SM80?

#38 ginowu closed 4 weeks ago
6
[QUESTION] Why is ring mode fixed to `All2All` in `src/all_gather/ths_op/all_gather_types.h`?

#37 lucifer1004 closed 2 months ago
2
[QUESTION] The gemm time on GPUs of different ranks under tp8 is very different, causing low performance

#36 Rainlin007 opened 2 months ago
8
[ENHANCEMENT] support for gpu A40

#35 1926627357 closed 2 months ago
3
[QUESTION] Why is flux gemm_rs not faster than torch?

#34 hxdtest opened 2 months ago
5
[QUESTION] How to use nvshmem?

#33 chenhongyu2048 closed 2 months ago
8
[BUG] `no_nvlink` branch failed to compile

#32 lucifer1004 closed 2 months ago
6
[QUESTION] Is there a plan to support int8?

#31 Rainlin007 closed 2 months ago
1
[BUG] Failing to install byte-flux from pypi

#30 tlrmchlsmth opened 3 months ago
8
[QUESTION] How to run examples in pynvshmem

#29 TonyWu199 closed 3 months ago
1
[BUG] Can't find nccl when building from source

#28 KnowingNothing opened 3 months ago
5
[QUESTION] Are you planning on supporting FP8?

#27 MustafaFayez closed 2 months ago
3
Update README.md

#26 zheng-ningxin closed 3 months ago
0
Support pip installation

#25 zheng-ningxin closed 3 months ago
0
add torch version to the whl name

#24 zheng-ningxin closed 3 months ago
0
Support performance tuning for gemm-rs kernel on sm80

#23 zheng-ningxin closed 4 months ago
0
Remove pynvshmem import in gemm_rs_80.py

#22 tlrmchlsmth closed 4 months ago
0
Tune the AG performance for the llama-8b

#21 zheng-ningxin closed 4 months ago
0
Are there any difficulties in implementing gemm-allreduce?

#20 Rainlin007 opened 4 months ago
2
feat: fix tuning for the all-gather gemm && move the reset-signal() to the forward critical path

#19 zheng-ningxin closed 4 months ago
0
zero out all the allocated shm buffer

#18 zheng-ningxin closed 4 months ago
0
[BUG] Incorrect results from flux.AGKernel for some problem shapes

#17 tlrmchlsmth closed 4 months ago
15
Update README.md

#16 zheng-ningxin closed 4 months ago
0
Add more device types for the time estimation.

#15 zheng-ningxin closed 4 months ago
0
[BUG] Exception: not supported device NVIDIA H100 80GB HBM3

#14 wenscarl closed 4 months ago
2
using c10::intrusive_ptr<c10d::ProcessGroup> as argument from python

#13 houqi closed 4 months ago
1
fix the _allgather_base backend issue(issue11)

#12 zheng-ningxin closed 4 months ago
0
[BUG] RuntimeError: Could not retrieve or create the backend 2 for device type cuda

#11 tlrmchlsmth closed 4 months ago
14
[BUG] Illegal memory access when fuse_reduction=False

#10 tlrmchlsmth closed 3 months ago
5
Support IPC && SM90 version of AG-GEMM, GEMM-RS

#9 zheng-ningxin closed 4 months ago
0
[BUG] Illegal memory access in GemmRS when passing fuse_reduction=True and dtype=bfloat16

#8 tlrmchlsmth closed 4 months ago
4
[BUG] gemm and reduce-scatter are not overlapped

#7 wenscarl closed 4 months ago
9
Update BibTex

#6 wenlei-bao closed 5 months ago
0
Add arXiv paper link

#5 wenlei-bao closed 5 months ago
0
Reorganize and deduplicate files

#4 wenlei-bao closed 5 months ago
0
All gather and reduce scatter on SM80

#3 zheng-ningxin closed 5 months ago
0
add cutlass submodule and patches

#2 liwenchangbdbz closed 8 months ago
0
add issue template

#1 liwenchangbdbz closed 8 months ago
0