Closed Aisuko closed 5 months ago
Currently, we use GPT-2 as the base model. As discussed, we will migrate to a model that focuses more on dialogue. We are also interested in very large LLMs with substantial layer redundancy.
So, I think we can implement a baseline evaluation system to help us complete this task in a reasonable way. A slightly out-of-date baseline is OK for our situation.
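As a minimal sketch of what such a baseline evaluation could start from (the function name and toy numbers below are illustrative, not from any existing code), perplexity over held-out text is a standard, easy-to-compare metric:

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(-mean log-probability) over a sequence of tokens.

    Lower is better; a uniform guess over V tokens gives perplexity V.
    """
    n = len(token_log_probs)
    return math.exp(-sum(token_log_probs) / n)

# Toy example: the model assigns probability 0.25 to each of 3 tokens,
# so perplexity comes out to exactly 4.0.
log_probs = [math.log(0.25)] * 3
print(round(perplexity(log_probs), 2))  # → 4.0
```

In practice the per-token log-probabilities would come from the candidate model itself (e.g. from its logits on an evaluation set), and the same metric lets us compare the GPT-2 baseline against each candidate.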
The candidates can be:

- a smaller model we create by applying transfer-learning techniques to the baseline model
- minimal-size, RLHF pre-trained models carrying the Hub labels `trl` and `text-generation-inference`
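The selection criteria above could be sketched as a simple filter (the model names, parameter counts, and size cutoff below are made-up placeholders; the real candidates would come from the Hub's model metadata):

```python
# Hypothetical candidate records, mimicking model-card metadata.
candidates = [
    {"id": "org/model-a", "tags": ["trl", "text-generation-inference"], "params_m": 125},
    {"id": "org/model-b", "tags": ["text-generation-inference"], "params_m": 7000},
    {"id": "org/model-c", "tags": ["trl", "text-generation-inference"], "params_m": 70},
]

def select(models, required_tags=frozenset({"trl", "text-generation-inference"}),
           max_params_m=500):
    """Keep models that carry all required tags and stay under the size
    budget, smallest first (we want minimal size)."""
    return sorted(
        (m for m in models
         if required_tags <= set(m["tags"]) and m["params_m"] <= max_params_m),
        key=lambda m: m["params_m"],
    )

print([m["id"] for m in select(candidates)])  # → ['org/model-c', 'org/model-a']
```

The same two tags can be used to query the Hub directly when we compile the real candidate list.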