lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Apache License 2.0

T5 model - what pattern? #876

Closed: lizelive closed this issue 1 year ago

lizelive commented 1 year ago

I love the T5 model.

https://github.com/lm-sys/FastChat/blob/a26db3c814889035d92c8ae80d6defbd7381ee55/fastchat/train/train_flant5.py#LL170C12-L170C12

It seems to use ### USER: as the separator, but I thought the project had moved over to using <s> to separate turns?

merrymercy commented 1 year ago

The T5 model still uses the "### Human/Assistant" separator style, the same as vicuna-v0, so it uses this conversation template: https://github.com/lm-sys/FastChat/blob/ea6c7b6da47d15d6e3264d0abba7b8d1090479a4/fastchat/conversation.py#L192

@DachengLi1 We may consider moving to the new style with EOS as the separator.
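
For anyone comparing the two styles mentioned above, here is a minimal sketch of how the prompts differ. It is not the code from `conversation.py`: the function names, the `USER`/`ASSISTANT` role names in the second style, and the exact whitespace are assumptions made for illustration only.

```python
# Illustrative only: these helpers are hypothetical and are not FastChat's API.

def single_separator_prompt(turns, sep="### "):
    """vicuna-v0-like style: one separator placed before every role tag."""
    parts = []
    for role, message in turns:
        if message is None:
            # Open slot for the model to complete.
            parts.append(f"{sep}{role}:")
        else:
            parts.append(f"{sep}{role}: {message}\n")
    return "".join(parts)


def eos_separator_prompt(turns, user_role="USER", sep=" ", eos="</s>"):
    """EOS style: user turns end with a space, assistant turns end with EOS."""
    parts = []
    for role, message in turns:
        if message is None:
            # Open slot for the model to complete.
            parts.append(f"{role}:")
        else:
            end = sep if role == user_role else eos
            parts.append(f"{role}: {message}{end}")
    return "".join(parts)


old_style = single_separator_prompt(
    [("Human", "Hi"), ("Assistant", "Hello!"), ("Human", "Bye"), ("Assistant", None)]
)
new_style = eos_separator_prompt(
    [("USER", "Hi"), ("ASSISTANT", "Hello!"), ("USER", "Bye"), ("ASSISTANT", None)]
)
print(repr(old_style))
print(repr(new_style))
```

With the EOS style, each assistant reply ends in the tokenizer's end-of-sequence token, so generation can stop on EOS instead of matching a "###" string.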