microsoft / LMOps

General technology for enabling AI capabilities w/ LLMs and MLLMs
https://aka.ms/GeneralAI
MIT License
3.71k stars, 283 forks

Unable to find training code for AdaptLLM #258

Closed chowkamlee81 closed 1 month ago

chowkamlee81 commented 2 months ago

Where can we find the training code for AdaptLLM? Only the inference code is provided.

cdxeve commented 2 months ago

Thanks for the question. We've provided code in the adaptllm repo to convert raw corpora into a reading comprehension format. After that, you’ll need to mix the converted data with general instructions from OpenOrca at a 1:1 ratio (counted by tokens).
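
As a rough illustration of the 1:1 token-count mixing described above, here is a minimal sketch. It is not the code from the adaptllm repo: the function names `count_tokens_whitespace` and `mix_by_token_ratio` are hypothetical, the whitespace token counter is only a stand-in for the tokenizer of the model being trained, and it assumes the general-instruction pool is at least as large as the converted domain data.

```python
import random
from typing import Callable, Iterable, List, Tuple


def count_tokens_whitespace(text: str) -> int:
    # Stand-in token counter (assumption): splits on whitespace.
    # In practice you would count with the tokenizer of the model you train.
    return len(text.split())


def mix_by_token_ratio(
    domain_texts: Iterable[str],
    general_texts: Iterable[str],
    count_tokens: Callable[[str], int] = count_tokens_whitespace,
    seed: int = 0,
) -> Tuple[List[str], int, int]:
    """Mix two corpora so their token counts are roughly 1:1.

    Keeps all domain texts, then takes general texts in order until their
    token total reaches the domain token total, and shuffles the result.
    """
    domain = list(domain_texts)
    domain_tokens = sum(count_tokens(t) for t in domain)

    general: List[str] = []
    general_tokens = 0
    for text in general_texts:
        if general_tokens >= domain_tokens:
            break  # token budget matched; stop adding general data
        general.append(text)
        general_tokens += count_tokens(text)

    mixed = domain + general
    random.Random(seed).shuffle(mixed)  # deterministic shuffle for reproducibility
    return mixed, domain_tokens, general_tokens


if __name__ == "__main__":
    # Toy placeholder strings, not real data.
    reading_comprehension = ["passage one with a question", "passage two"]
    general_instructions = ["follow this instruction", "answer the question", "extra"]
    mixed, d_tok, g_tok = mix_by_token_ratio(reading_comprehension, general_instructions)
    print(len(mixed), d_tok, g_tok)
```

Because whole examples are added until the budget is reached, the general side can overshoot the domain token count by up to one example; truncating the last example would be one way to make the ratio exact.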

Except for the pre-training data, our pre-training process is the same as the vanilla pre-training of language models. You may refer to our pre-training suggestions or this issue for more details.