Closed nps798 closed 11 months ago
Hi, for the pretraining please have a look at LeoLM. They will also publish a paper with all the training details soon. (I also did some pretraining of the 7B Llama 2 model, but only at a fraction of LeoLM's scale.)
We didn't extend the original tokenizer vocabulary. Actually, the Llama 2 tokenizer is not well suited to German text, and I was myself surprised that the pretrained Mistral model is able to generate such good text despite this (though I don't know whether that holds for non-Romance languages).
I will have a look and wait for their paper. Thanks!
Best regards.
Hi, thanks for all your work! I am wondering if you would be willing to share the code you used for continued pretraining and fine-tuning in German? Did you extend the original tokenizer vocabulary?