redotvideo/mamba-chat
Mamba-Chat: A chat LLM based on the state-space model architecture 🐍
Apache License 2.0
911 stars
69 forks
issues
cant find torch even if it installed !
#39
GiuseppeLeviBo
opened
1 month ago
1
Can the train_mamba.py be used to pretrain the model?
#38
ReaganGen
opened
1 month ago
0
fail to install requirements.txt on MAC
#37
taozhiyuai
opened
2 months ago
1
finetuning error
#36
khfs
opened
3 months ago
0
Question about how mamba chat training is done
#35
aravindkoti
opened
6 months ago
0
Error in importing from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel in GoogleCollab
#34
ankitsrivastava637
opened
8 months ago
4
committed
#33
TeamOctaLink
opened
9 months ago
0
Why choose zephyr as tokenizer?
#32
rangehow
closed
9 months ago
0
sentencepiece version
#31
nuochenpku
opened
9 months ago
0
I downloaded the mamba-790m file from Hugging Face to my local machine for loading and training. However, I encountered an error during the loading process, like that "Missing key(s) in state_dict: "backbone.layers.0.mixer.A_b_log""
#30
zxsdd9
closed
9 months ago
0
Is there Padding Mask when training the model?
#29
ZetangForward
opened
9 months ago
0
'MambaConfig' object has no attribute 'to_dict'
#28
sooko
opened
9 months ago
1
update to fit MambaConfig
#27
ericoder960803
opened
9 months ago
0
How to use the model after training it ?
#26
kishore-FDI
opened
9 months ago
2
Is the provided chat model trained on ultrachat_small.jsonl?
#25
shansiliu95
opened
10 months ago
1
Feature 'cvt with .bf16' requires .target sm_80 or higher Error
#24
venkat-p-r
opened
10 months ago
1
Interesting chat example
#23
protima-banerjee
opened
10 months ago
0
MoE https://arxiv.org/abs/2401.04081
#22
Eupham
closed
10 months ago
0
ImportError: /usr/local/lib/python3.10/dist-packages/causal_conv1d_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda9SetDeviceEi
#21
venkat-p-r
closed
10 months ago
1
Added check to confirm bfloat support
#20
Rohith04MVK
closed
3 months ago
1
Any plan or interest to use OpenChat algorithm (https://github.com/imoneoi/openchat) to train your chat?
#19
houghtonweihu
opened
10 months ago
0
Eval on benchmark?
#18
tic-top
opened
11 months ago
0
Error When Inferencing
#17
SeifMosaad
closed
11 months ago
1
Cant use trained model
#16
yy9996
opened
11 months ago
3
Error during training
#15
Eupham
closed
10 months ago
6
Memory requirements for training
#14
pkpro
opened
11 months ago
0
enhancement: better model_save
#13
getorca
opened
11 months ago
0
🐞 fix: triton requirement
#12
Yingyue-L
opened
11 months ago
0
Finetune on 3090 but loss equal to zero
#11
Yingyue-L
opened
11 months ago
4
How could I run this on windows 10?
#10
KevinRyu
opened
11 months ago
5
Colab notebook has error, numpy array used instead of torch
#9
microcoder-py
closed
11 months ago
7
TypeError: MixerModel.__init__() got an unexpected keyword argument 'bos_token_id'
#8
xiechengmude
opened
11 months ago
1
add missing requirements
#7
tohrnii
closed
11 months ago
2
Add a simple gradio chat ui for Mamba Chat
#6
BlenderWang9487
closed
11 months ago
1
setting device for training
#5
ekg
closed
12 months ago
1
Demo
#4
fakerybakery
closed
11 months ago
3
Issue while installing requirements.txt
#3
wereretot
opened
12 months ago
5
Bug in train_mamba.py line 53
#2
vmajor
closed
12 months ago
2
Add ability to train on smaller cards like the 24GB 3090 or 4090. Fixed epoch argument.
#1
rwl4
closed
12 months ago
1