Closed: flaviusburca closed this issue 1 month ago
We will publish the datasets as well in the following 2-3 weeks.
Right now the focus is on releasing Ro-Mistral-7b-Instruct,
which shows a substantial improvement over Ro-Llama2-7b
using the same recipe and datasets.
@trebedea Are the training datasets mainly a Romanian sub-split of the multilingual datasets that Llama-2 was trained on, plus some others? If so, any idea whether a Llama-3 version is on the roadmap?
We do open research, so the technical report, including the datasets used for training and finetuning, is public: https://arxiv.org/abs/2405.07703
Our main aim is to identify a "recipe" which allows improvement of any generic LLM, meaning also Mistral / Mixtral or Llama-3. Right now we have promising results showing that Mistral-7B can be improved using the current "recipe".
The current recipe is (in short; detailed in the paper / technical report):
Hope this makes sense.
@trebedea Yes, it does. Thanks! 👍🏼
The translated datasets are now available on HF.
Are the actual datasets open-source? Will they be published on HuggingFace?