redotvideo / mamba-chat

Mamba-Chat: A chat LLM based on the state-space model architecture 🐍
Apache License 2.0
878 stars, 68 forks

Is the provided chat model trained on ultrachat_small.jsonl? #25

Open shansiliu95 opened 5 months ago

shansiliu95 commented 5 months ago

I trained for 10 epochs on ultrachat_small.jsonl, but the resulting model is much worse than the chat model provided.

venkat-p-r commented 5 months ago

Which GPU card are you using for training?