microsoft/tutel
Tutel MoE: An Optimized Mixture-of-Experts Implementation
MIT License · 685 stars · 85 forks
Issues (newest first)
#240  Question regarding the load importance loss calculation (wangyirui, opened 4 weeks ago, 1 comment)
#239  How about the cost of TUTEL features? (fyang064, opened 1 month ago, 1 comment)
#238  fix(fast_dispatch): saving input tensor using ctx.save_for_backward (KimmiShi, closed 1 month ago, 1 comment)
#237  Potential Memory Leak in GatingEncoder/Decoder of Fast_Dispatch (KimmiShi, closed 1 month ago, 1 comment)
#236  How to use Megablocks in MoE training (CSCYQJ, opened 1 month ago, 1 comment)
#235  add built-in llama_ffn; add helloworld_custom_expert_sharded (ghostplant, closed 2 months ago, 1 comment)
#234  update README.md for v0.3.2 (ghostplant, closed 2 months ago, 0 comments)
#233  Can tutel support Pipeline Parallel? (xcwanAndy, closed 2 months ago, 1 comment)
#232  [Question] Comparison to FasterMoE (Guodanding, opened 2 months ago, 4 comments)
#231  using TUTEL_GLOBAL_TIMEOUT_SEC to make NCCL timeout configurable (ghostplant, closed 2 months ago, 0 comments)
#230  Qs (zws98, opened 2 months ago, 3 comments)
#229  replace unnecessary zeros -> empty (ghostplant, closed 3 months ago, 0 comments)
#228  enable message size larger than 4GB for all_to_all_v/all_gather_v (ghostplant, closed 3 months ago, 0 comments)
#227  add tutel.examples.helloworld_demo based on custom experts (ghostplant, closed 3 months ago, 1 comment)
#226  How to create a custom expert with tutel? (zws98, opened 3 months ago, 19 comments)
#225  update online setup instructions (ghostplant, closed 4 months ago, 0 comments)
#224  Add option to install for CPU only: export NO_CUDA=1 (ghostplant, closed 5 months ago, 0 comments)
#223  add device initialization for ops on non-default devices (ghostplant, closed 6 months ago, 0 comments)
#222  add example files for NCCL all_to_all_v/all_gather_v (ghostplant, closed 6 months ago, 0 comments)
#221  add primitives: net.batch_all_to_all_v(), net.batch_all_gather_v() (ghostplant, closed 6 months ago, 0 comments)
#220  [Question] Why use datatype ncclInt8 in nccl_all_to_all_scatter_async (cicirori, opened 6 months ago, 1 comment)
#219  How to implement Fairseq-MoE training checkpoint like Swin-MoE? (withinmiaov, opened 8 months ago, 1 comment)
#218  Non-surface function utilities only work for contiguous input data (lyd126, opened 8 months ago, 12 comments)
#217  fill zeros with warning for params not defined in state_dict (ghostplant, closed 9 months ago, 0 comments)
#216  Enable running without bias and update ffn instantiation (vchiley, closed 9 months ago, 4 comments)
#215  RuntimeError: (0) == (cuModuleLoadDataEx(&hMod, image.c_str(), sizeof(options) / sizeof(*options), options, values)) INTERNAL ASSERT FAILED (jd730, closed 10 months ago, 3 comments)
#214  tutel is slower than the naive p2p using 2DH for small scale (DongyuXu77, opened 10 months ago, 3 comments)
#213  What is the difference between this and deepspeed-moe? (Hap-Zhang, closed 10 months ago, 2 comments)
#212  update tutel pipeline and setup deps (ghostplant, closed 11 months ago, 0 comments)
#211  numpy not in requirements (152334H, closed 11 months ago, 5 comments)
#210  updt init (vchiley, opened 11 months ago, 7 comments)
#209  fix a few casts (vchiley, closed 11 months ago, 1 comment)
#208  always use torch.distributed.run in new torch versions (ghostplant, closed 11 months ago, 0 comments)
#207  how to use tutel on Megatron Deepspeed (wangyuxin87, opened 1 year ago, 4 comments)
#206  Can this package support the one-gpu machine (momo1986, opened 1 year ago, 5 comments)
#205  add more comment in helloworld_ddp example (ghostplant, closed 1 year ago, 0 comments)
#204  Training with Data and Expert Parallelism (santurini, opened 1 year ago, 5 comments)
#203  INTERNAL ASSERT FAILED (Qicheng-WANG, opened 1 year ago, 5 comments)
#201  about compute_location and locations (adverbial03, opened 1 year ago, 1 comment)
#199  add tutel.examples.helloworld_switch (ghostplant, closed 1 year ago, 0 comments)
#198  ImportError: cannot import name 'tutel_custom_kernel' from 'tutel.impls.jit_compiler' (zhaojiancheng007, opened 1 year ago, 12 comments)
#197  [Bug] The function func_fwd is calculated inconsistently on CPU and GPU (starkhu, closed 1 year ago, 1 comment)
#196  tutel/jit_kernels/sparse.py: torch.float16 CUDA result is inconsistent with the CPU result, and the array goes out of bounds (WsqRichards1, opened 1 year ago, 1 comment)
#195  All2All precision always in fp32 (vchiley, opened 1 year ago, 1 comment)
#194  add reset_parameters fn; updt .to() fn; enable device and dtype pass thru (vchiley, closed 11 months ago, 1 comment)
#193  Fix tutel compatibility in torch 2.0 (ghostplant, closed 1 year ago, 0 comments)
#192  How are the experts' gradients handled under data parallelism? (yzs981130, opened 1 year ago, 1 comment)
#191  removed logit_scale without device casting (Harsh-Sensei, closed 1 year ago, 1 comment)
#190  RuntimeError: No such operator tutel_ops::cumsum (sharkdrop, opened 1 year ago, 10 comments)
#189  [installation errors] fatal error: nccl.h: No such file or directory (qianyuzqy, opened 1 year ago, 1 comment)