Stability-AI / StableLM

StableLM: Stability AI Language Models
Apache License 2.0

Any advice how to train a model of a different language? #66

finom closed this issue 1 year ago

finom commented 1 year ago

I'm a complete noob at NLP and LLMs and I have no idea how StableLM or LLaMA were trained. I'd like to have a model that understands and responds in Ukrainian, if that's not too hard to do. I guess I need a set of prompts to train with? Can you point me to what I need to learn and do to achieve that goal? Thank you for your work!

zhenchuan9r commented 1 year ago

I want to know too. How much will it cost?

mcmonkey4eva commented 1 year ago

Re the original question, getting a model to speak Ukrainian: well, have you tried it? A lot of models can kinda just speak other languages out-of-the-box if you prompt them in that language. Their training is mostly English-centric, but it does tend to work.


StableLM's training details haven't been published yet (they will be soon, watch this repo's readme for updates), but you can read LLaMA's at https://research.facebook.com/file/1574548786327032/LLaMA--Open-and-Efficient-Foundation-Language-Models.pdf

Regarding cost, from LLaMA's paper: (image from the paper omitted). The market rate for an hour of A100 usage is somewhere around $1-$2, iirc? So training LLaMA-65B from scratch cost around $1-2 million, while LLaMA-7B cost more like $80-160 thousand.
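The arithmetic behind those estimates is just GPU-hours times the hourly rate. Here is a minimal sketch of it. The GPU-hour figures are rounded values implied by the estimates above (roughly 80 thousand A100-hours for 7B and roughly 1 million for 65B), not exact numbers from the paper, and the $1-$2 rate is the rough guess from the comment, so treat all of it as assumptions.

```python
# Rounded A100 GPU-hour figures implied by the estimates in the comment.
# These are assumptions for illustration, not exact values from the paper.
GPU_HOURS = {"LLaMA-7B": 80_000, "LLaMA-65B": 1_000_000}

# Rough market rate per A100-hour in USD (the comment's "iirc" guess).
RATE_LOW, RATE_HIGH = 1.0, 2.0


def cost_range(gpu_hours, low=RATE_LOW, high=RATE_HIGH):
    """Return the (low, high) training cost in USD for a GPU-hour budget."""
    return gpu_hours * low, gpu_hours * high


for name, hours in GPU_HOURS.items():
    low_cost, high_cost = cost_range(hours)
    print(f"{name}: ${low_cost:,.0f} to ${high_cost:,.0f}")
```

Plug in whatever rate your cloud provider actually charges; the result scales linearly with it.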


That said, that's all in relation to training a base model. You don't necessarily need to do that, as, well, we already have base models! You can use LLaMA or StableLM or any other base, and just finetune it, which comes in much cheaper. Full model finetunes based on LLaMA have been done successfully in the community for $100 or less. Training LoRAs is even cheaper, as you can do it on consumer-tier hardware! If a LoRA is sufficient for your use case, check https://github.com/oobabooga/text-generation-webui/blob/main/docs/Training-LoRAs.md for a guide to training LoRAs using text-generation-webui.
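A rough way to see why LoRAs are so much cheaper: instead of updating a full weight matrix of shape d x k, a LoRA freezes it and trains two small matrices of shapes d x r and r x k, where the rank r is small. The sketch below just counts trainable parameters for one such matrix. The 4096 x 4096 shape and rank 8 are illustrative assumptions, not settings taken from this thread or the linked guide.

```python
def full_params(d, k):
    """Trainable parameters when finetuning a full d x k weight matrix."""
    return d * k


def lora_params(d, k, r):
    """Trainable parameters for a rank-r LoRA on a d x k matrix.

    The adapter is two matrices, d x r and r x k, so r * (d + k) in total.
    """
    return r * (d + k)


# Illustrative (assumed) shape and rank for a single projection matrix.
d, k, r = 4096, 4096, 8
full = full_params(d, k)
lora = lora_params(d, k, r)
print(f"full finetune: {full:,} params")
print(f"LoRA rank {r}: {lora:,} params ({full // lora}x fewer)")
```

Fewer trainable parameters means far less memory for gradients and optimizer state, which is what makes consumer GPUs viable.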

For teaching the model a language, if the base model is bad at it out-of-the-box, you might just gather a few good references in that language (e.g. Wikipedia in the language, maybe some books, etc.) and do a short finetuning run. It doesn't necessarily need to learn how to speak the language from scratch. It just has to learn the grammar and words, and then cross-associate those words with concepts it has already learned through English.
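As a starting point for the "gather some references" step, here is a minimal sketch of turning raw text into training chunks. The function names, the 2000-character limit, and the one-JSON-object-per-line output with a "text" field are all hypothetical choices for illustration; check what format your finetuning tool actually expects.

```python
import json


def chunk_paragraphs(text, max_chars=2000):
    """Greedily pack blank-line-separated paragraphs into chunks.

    Each chunk stays within max_chars, except that a single paragraph
    longer than max_chars is kept whole rather than being cut.
    """
    chunks, current = [], ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


def to_jsonl(chunks):
    """Serialize chunks as JSON lines, keeping non-ASCII text readable."""
    return "\n".join(json.dumps({"text": c}, ensure_ascii=False) for c in chunks)


sample = "Перший абзац.\n\nДругий абзац.\n\nТретій абзац."
print(to_jsonl(chunk_paragraphs(sample, max_chars=30)))
```

In practice you would read the text from your Wikipedia dump or book files and write the result to disk, then point the training tool at that file.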