HazyResearch/m2
Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture"
Apache License 2.0
535 stars
43 forks
issues
What part of code handles causal modeling?
#39
d5555
opened
1 month ago
0
I am not getting relevant results with m2-bert-80M-32k-retrieval
#38
legaltextai
opened
1 month ago
3
Loco Benchmark for huggingface models
#37
sandeep-krutrim
opened
3 months ago
0
Unable to use 'convert_dataset.py' to load data
#36
sandeep-krutrim
opened
3 months ago
1
Getting Cuda error when trying to train for 8k context
#35
sandeep-krutrim
opened
3 months ago
0
What category does the M2 model belong to
#34
41924076
opened
4 months ago
2
Update embeddings_inference.py
#33
jonsaadfalcon
closed
4 months ago
0
Add change from main
#32
jonsaadfalcon
closed
4 months ago
0
Merge updates to main with branch
#31
jonsaadfalcon
closed
4 months ago
0
Embedding speed seems slow
#30
YourSaDady
opened
4 months ago
2
Using (Absolute) Positional Embeddings with Hyena Operators
#29
saberiato
opened
5 months ago
4
torch.bmm kernel fusion
#28
Edenzzzz
opened
5 months ago
0
training data
#27
41924076
closed
4 months ago
14
Cannot use autoresume and wandb together
#26
flyingkukri
opened
7 months ago
1
About the versions of PyTorch, CUDA, and other dependencies used in the implementation of the Monarch Mixer
#25
yingxuanhi
opened
7 months ago
20
Update embedding code
#24
jonsaadfalcon
closed
4 months ago
1
LoCo Benchmark - BM25 & Insights
#23
calpt
opened
7 months ago
9
A question on square matrices
#22
GiftedNovaHD
opened
8 months ago
1
precision on imagenet experiment
#21
Karami-m
opened
8 months ago
1
Multilingual?
#20
RibinMTC
opened
8 months ago
1
Why is there such a big difference in cosine similarity between embeddings of the same pair when using padding=max_length versus padding=true?
#19
qianyue76
opened
8 months ago
2
Adding python wheels
#18
michaelfeil
opened
8 months ago
0
Update README.md
#17
eltociear
closed
7 months ago
0
does M2 work with ONNX?
#16
andersonbcdefg
opened
8 months ago
3
Missing Licence
#15
michaelfeil
closed
8 months ago
1
readme
#14
simran-arora
closed
8 months ago
0
MNLI yaml config
#13
sukjunhwang
closed
9 months ago
0
M2 model is applied on single image deraining model based on transformer
#12
yingxuanhi
opened
10 months ago
5
What should I do if the √N is not an integer?
#11
ChaiJinhao
closed
10 months ago
3
Is there any detailed explanation of M2-VIT?
#10
ChaiJinhao
opened
11 months ago
0
can i use MonarchMixer replace cross attention lay
#9
autumn-2-net
opened
11 months ago
4
MonarchMixerLayer
#8
jeohalves
opened
11 months ago
1
Will M2-GPT be open-sourced?
#7
yangsp5
opened
11 months ago
13
Code for model architecture
#6
jimmylihui
opened
11 months ago
0
Some minor inconsistency in the paper
#5
radarFudan
closed
7 months ago
2
Bert-like implementation
#4
lhallee
closed
11 months ago
4
Code for projecting pre-trained BERT weights into Monarch matrices
#3
sinamps
opened
1 year ago
2
Fixing README instructions for git clone
#2
Mocuto
closed
1 year ago
1
`python3 test_flash_mm.py` got error
#1
tiendung
closed
1 year ago
5