redotvideo / mamba-chat

Mamba-Chat: A chat LLM based on the state-space model architecture 🐍
Apache License 2.0

enhancement: better model_save #13

Open getorca opened 9 months ago

getorca commented 9 months ago

This pull request has 3 small changes to make saving the model easier:

1) Sets a default value for `_internal_call` in `save_model`, as in the Hugging Face Trainer: https://github.com/huggingface/transformers/blob/c48787f347bd604f656c2cfff730e029c8f8c1fe/src/transformers/trainer.py#L2797C12-L2797C12
2) Adds an argument parser option for `output_dir`.
3) Calls `save_model` when training is complete.
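For readers who want to see roughly how the three changes fit together, here is a minimal, standard-library-only sketch. It is not the code from this pull request: `SketchTrainer`, `build_parser`, `run`, the placeholder file name, and the `model_output` default are all hypothetical names chosen for illustration. Only the `save_model(output_dir=None, _internal_call=False)` signature mirrors the Hugging Face Trainer method linked above.

```python
import argparse
import os
from typing import Optional


class SketchTrainer:
    """Hypothetical stand-in for the repository's trainer class."""

    def __init__(self, output_dir: str):
        self.output_dir = output_dir
        self.trained = False

    def train(self) -> None:
        # Placeholder for the real training loop.
        self.trained = True

    def save_model(self, output_dir: Optional[str] = None,
                   _internal_call: bool = False) -> str:
        # Change 1: _internal_call defaults to False, so callers may omit it.
        target = output_dir if output_dir is not None else self.output_dir
        os.makedirs(target, exist_ok=True)
        # Assumption: a real trainer would write model weights here instead.
        path = os.path.join(target, "model_placeholder.txt")
        with open(path, "w") as f:
            f.write("weights would be written here\n")
        return path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Change 2: the output directory is configurable from the command line.
    # The default value is an assumption made for this sketch.
    parser.add_argument("--output_dir", type=str, default="model_output")
    return parser


def run(argv=None) -> str:
    args = build_parser().parse_args(argv)
    trainer = SketchTrainer(output_dir=args.output_dir)
    trainer.train()
    # Change 3: save the model once training is complete.
    return trainer.save_model()
```

Under these assumptions, calling `run(["--output_dir", "./checkpoints"])` trains, then writes the placeholder file into `./checkpoints` without the caller having to pass `_internal_call`.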