redotvideo / mamba-chat

Mamba-Chat: A chat LLM based on the state-space model architecture 🐍
Apache License 2.0

TypeError: MixerModel.__init__() got an unexpected keyword argument 'bos_token_id' #8

Open xiechengmude opened 7 months ago

xiechengmude commented 7 months ago

I trained the model with axolotl.

Here's the error from chat.py:

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Traceback (most recent call last):
  File "/workspace/mamba-chat/xdan-chat.py", line 12, in <module>
    model = MambaLMHeadModel.from_pretrained(model_path, device="cuda", dtype=torch.float16)
  File "/root/miniconda3/envs/axo/lib/python3.10/site-packages/mamba_ssm/models/mixer_seq_simple.py", line 231, in from_pretrained
    model = cls(**config, device=device, dtype=dtype, **kwargs)
  File "/root/miniconda3/envs/axo/lib/python3.10/site-packages/mamba_ssm/models/mixer_seq_simple.py", line 190, in __init__
    self.backbone = MixerModel(
TypeError: MixerModel.__init__() got an unexpected keyword argument 'bos_token_id'

justusmattern27 commented 7 months ago

It looks like some Hugging Face-specific arguments (bos_token_id) are being passed to the model automatically, probably from the saved config, but there's too little context here to tell exactly what is happening. Could you share some more code, specifically how you initialize the model?
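
To illustrate the suspected failure mode, here is a minimal sketch that does not use mamba_ssm at all. TinyMixer, filter_kwargs and the config values are hypothetical stand-ins, not the library's API. It assumes the saved config contains extra keys such as bos_token_id that the constructor does not accept, and shows one possible way to drop them before building the model:

```python
import inspect


class TinyMixer:
    """Hypothetical stand-in for a model class with a strict constructor."""

    def __init__(self, d_model, n_layer, vocab_size):
        self.d_model = d_model
        self.n_layer = n_layer
        self.vocab_size = vocab_size


def filter_kwargs(cls, config):
    """Split config into keys that cls.__init__ accepts by name and the rest."""
    params = inspect.signature(cls.__init__).parameters
    accepted = {
        name
        for name, p in params.items()
        if name != "self"
        and p.kind in (p.POSITIONAL_OR_KEYWORD, p.KEYWORD_ONLY)
    }
    kept = {k: v for k, v in config.items() if k in accepted}
    dropped = sorted(k for k in config if k not in accepted)
    return kept, dropped


# Assumed config contents: the token ids stand in for the extra
# Hugging Face keys that a training tool may have written to config.json.
config = {
    "d_model": 768,
    "n_layer": 24,
    "vocab_size": 50277,
    "bos_token_id": 0,
    "eos_token_id": 0,
}

# Passing the raw config reproduces the same kind of TypeError as above.
try:
    TinyMixer(**config)
    raw_error = None
except TypeError as exc:
    raw_error = str(exc)

# Filtering first lets construction succeed and reports what was dropped.
kept, dropped = filter_kwargs(TinyMixer, config)
model = TinyMixer(**kept)
print("dropped keys:", dropped)
```

If this is what is happening, comparing the keys in the checkpoint's config.json against the constructor's parameters should show which entries were added during training.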