shawntan/scattermoe
Triton-based implementation of Sparse Mixture of Experts.
Apache License 2.0
147 stars, 9 forks
issues
#12 Can't use torch.compile (shikhartuli, opened, 4 days ago, 0 comments)
#11 Question: Multi-node training (casper-hansen, opened, 2 months ago, 3 comments)
#10 Model with balanced load runs slower than the imbalanced (CanyonWind, closed, 2 months ago, 3 comments)
#9 No module named 'torch' (winkelstein, opened, 2 months ago, 4 comments)
#8 ParallelLinear with bias (CanyonWind, closed, 2 months ago, 2 comments)
#7 Megablocks example (ehartford, opened, 3 months ago, 0 comments)
#6 Experts with different capacity (CanyonWind, closed, 2 months ago, 4 comments)
#5 Accuracy Issues (jeromeku, closed, 2 months ago, 11 comments)
#4 Segfault CUDA 12.2 (sshleifer, opened, 3 months ago, 1 comment)
#3 pytest fail (Eutenacity, closed, 3 months ago, 6 comments)
#2 Mixtral inference example (casper-hansen, closed, 2 months ago, 5 comments)
#1 Tensor Parallelism (timmytwoteeth, opened, 3 months ago, 3 comments)