laekov / fastmoe
A fast MoE impl for PyTorch
https://fastmoe.ai · Apache License 2.0 · 1.52k stars · 182 forks
Issues (sorted by: Newest)
#209 · Forward pass return value is missing bal_loss · tisgotos · opened 1 day ago · 1 comment
#208 · Where can I get Megatron-LM v2.2? · tisgotos · closed 1 day ago · 7 comments
#207 · Error when running examples/transformer-xl/scripts/run_enwik8_base_moe.sh with Smart schedule enabled · WhatBrain · opened 1 week ago · 1 comment
#206 · No hiding output when using `pytest -s` · roastduck · closed 3 months ago · 0 comments
#205 · Make the code neutral to device by removing `.cuda()` · roastduck · closed 3 months ago · 0 comments
#204 · FasterMoE Shadow Policy: Detailed Inquiry · Guodanding · closed 4 months ago · 7 comments
#203 · Update readme-cn.md · HelloWorldLTY · closed 4 months ago · 0 comments
#202 · DDP error · Peg-Wu · closed 5 months ago · 0 comments
#201 · CUDA memory increases after each loss.backward() · sreetamasarkar · opened 5 months ago · 6 comments
#200 · Update switch_gate.py · Heihaierr · closed 6 months ago · 0 comments
#199 · A bug in switch_gate · Heihaierr · opened 6 months ago · 6 comments
#198 · About switch_gate · Heihaierr · opened 6 months ago · 1 comment
#197 · Multi-node problem · Qianshaowei · opened 6 months ago · 1 comment
#196 · Example to run Megatron · Juanhui28 · opened 6 months ago · 3 comments
#195 · [BUG] AttributeError: module 'fmoe_cuda' has no attribute 'assign_pos_' · pangsg · opened 6 months ago · 3 comments
#194 · cudaErrorInvalidDevice when running FMoE · pangsg · closed 7 months ago · 6 comments
#193 · Does fastmoe support fine-tuning? · PowerDispatch · closed 7 months ago · 0 comments
#192 · Does fastmoe support fine-tuning, paged attention, FlashAttention, KV cache, mixed precision, etc.? · PowerDispatch · opened 7 months ago · 4 comments
#191 · Can fastmoe be integrated into vLLM? · pangsg · opened 7 months ago · 4 comments
#190 · The prep_text8.py script is missing · PowerDispatch · closed 7 months ago · 1 comment
#189 · Is there an online discussion group? · PowerDispatch · opened 7 months ago · 1 comment
#188 · How to define MoE under dp+mp in fastmoe · daixiangzi · closed 6 months ago · 6 comments
#187 · This PR resolves issue #186 · Cobalt-27 · closed 7 months ago · 0 comments
#186 · num_experts argument error for Megatron-LM · Cobalt-27 · closed 7 months ago · 0 comments
#185 · [Feature] Make bias of gate optional for naive_gate and its subclasses · Zhang-RQ · closed 7 months ago · 0 comments
#184 · Segmentation fault when Smart schedule is enabled · Xingzhi107 · opened 8 months ago · 8 comments
#183 · pytest error · R-QinQ · opened 8 months ago · 3 comments
#182 · setup.py error! · R-QinQ · closed 8 months ago · 4 comments
#181 · ImportError: cannot import name 'get_args' from 'megatron' · peter-fei · opened 8 months ago · 5 comments
#180 · During inference, the output of noisy gate is nan · zqhang · opened 9 months ago · 5 comments
#179 · Inconsistent evaluation result when cloning expert parameters from the original FFN · Heihaierr · closed 9 months ago · 1 comment
#178 · MOELinear is much slower than torch.nn.Linear · kamanphoebe · closed 9 months ago · 7 comments
#177 · ModuleNotFoundError: No module named 'fmoe_cuda' · Taskii-Lei · opened 10 months ago · 1 comment
#176 · How to use balance loss? · Heihaierr · opened 10 months ago · 1 comment
#175 · Update clip-grad-v2.2.patch for the case where grads_in_moe is empty · Fragile-azalea · closed 11 months ago · 0 comments
#174 · Fix tests · laekov · closed 12 months ago · 0 comments
#173 · Fit old code with new smgr · laekov · closed 12 months ago · 0 comments
#172 · [BUG FIX] Fix bugs in stream manager · zms1999 · closed 12 months ago · 1 comment
#171 · Fix cublas gemm call for bf16 input · xptree · closed 1 year ago · 1 comment
#170 · MOELinear always returns a zero tensor for bf16 input · xptree · closed 1 year ago · 1 comment
#169 · MoE L2 norm reduce in Megatron · blankde · closed 1 month ago · 3 comments
#168 · No overlapping observed when enabling Smart Scheduling · chenyu-jiang · opened 1 year ago · 8 comments
#167 · Update outdated README · zms1999 · closed 1 year ago · 0 comments
#166 · Outdated doc for smart schedule with num_expert > 1? · chenyu-jiang · closed 1 year ago · 1 comment
#165 · Document for process groups · laekov · closed 1 year ago · 0 comments
#164 · Doc-string / documentation clarification for parallel groups · XMaster96 · closed 1 year ago · 2 comments
#163 · Only 204 unique tokens (vocabulary size) in enwik8 (transformer-XL example) · chenwydj · opened 1 year ago · 3 comments
#162 · fmoe with deepspeed · KimmiShi · opened 1 year ago · 0 comments
#161 · Mixture of Experts in Vision Task (Segmentation) · deep-matter · opened 1 year ago · 2 comments
#160 · bf16 support · laekov · closed 1 year ago · 0 comments