lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Apache License 2.0
36.34k stars 4.47k forks source link

mistral 7b train #2861

Open bigdante opened 8 months ago

bigdante commented 8 months ago

Nice work. I am trying to use fastchat to train a mistral model. however, I wonder why the following code is hard code for only vicuna. https://github.com/lm-sys/FastChat/blob/main/fastchat/train/train.py

conv = get_conversation_template("vicuna") and assert conv.sep_style == SeparatorStyle.ADD_COLON_TWO

Did I misunderstand something?

BeastyZ commented 8 months ago

Hi @bigdante After training, can you get the added_tokens.json, config.json and other json files? If so, could you share your script or code?

jwong8314 commented 7 months ago

I wrote a patch to support dolphin format. Will share it via a PR soon, but feel free to use dolphin format for mistral in the meantime: here