Closed · dinhanhx closed this 1 year ago
From the link you provided, StableVicuna = Vicuna + RLHF + Open Assistant-Anthropic-Stanford dataset.
Recently, Open Assistant, Anthropic, and Stanford have begun to make chat RLHF datasets readily available to the public. Those datasets, combined with the straightforward training of RLHF provided by trlX, are the backbone for the first large-scale instruction-finetuned and RLHF model we present here today: StableVicuna.
Quoted from Vicuna's blog
Vicuna = LLaMa + user-shared conversations collected from ShareGPT
Some might consider StableVicuna a LLaMa derivative. However, its training method differs, and its dataset composition is quite different (due to Open Assistant and Anthropic). Therefore, I think this work deserves its own entry.
I think that's a very fair point.
I was checking the list and found that StableLM is already there, and StableVicuna was released as part of it. So it's better to make a subentry under the StableLM entry.
Resolved by e30682ae1de8df142372b34e151291c31dd58081
https://stability.ai/blog/stablevicuna-open-source-rlhf-chatbot