mikeybellissimo / LoRA-MPT

A repo for finetuning MPT using LoRA. It is currently configured to work with the Alpaca dataset from Stanford but can easily be adapted to use another.
Apache License 2.0

Unsupervised training possible? #1

Open leoplusx opened 1 year ago

leoplusx commented 1 year ago

Hey, thanks for the repo!

Any ideas on how to use it to train with unlabelled data for causal language modelling? I want to adapt the foundation model to my domain first, before I do instruction fine-tuning.

mikeybellissimo commented 1 year ago

Hi, happy to hear you're enjoying it!

You'd just have to change the data and the way you preprocess it (you wouldn't want to do the prompting, for example) and also switch the collator from a Seq2Seq one to a language-modelling one. There's a good tutorial on Hugging Face for this available here: https://huggingface.co/docs/transformers/tasks/language_modeling#causal-language-modeling

Once you change the data and the preprocessing/collator then the rest is pretty much the same.
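For anyone landing here later, a minimal sketch of what that preprocessing change looks like, following the Hugging Face causal-LM tutorial linked above. The `group_texts` helper and `block_size` name come from that tutorial, not from this repo, and the collator swap at the bottom assumes the repo currently uses a Seq2Seq-style collator:

```python
# Sketch: chunk tokenized raw text into fixed-length blocks for
# causal language modelling (no prompt templates, no instruction labels).

block_size = 8  # tiny for illustration; in practice use something like 512-2048

def group_texts(examples):
    # Concatenate every tokenized example, then split into fixed-size blocks.
    # `examples` is a dict of lists, e.g. {"input_ids": [...], "attention_mask": [...]}
    concatenated = {k: sum(examples[k], []) for k in examples}
    total_length = (len(concatenated["input_ids"]) // block_size) * block_size
    result = {
        k: [t[i : i + block_size] for i in range(0, total_length, block_size)]
        for k, t in concatenated.items()
    }
    # For causal LM the labels are just a copy of the inputs;
    # the model shifts them internally when computing the loss.
    result["labels"] = result["input_ids"].copy()
    return result

# In the training script, the collator swap would then look like:
#   from transformers import DataCollatorForLanguageModeling
#   data_collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)
# (mlm=False gives plain causal LM rather than masked LM.)
```

You'd map `group_texts` over the tokenized dataset with `dataset.map(group_texts, batched=True)` in place of the Alpaca prompt-building step.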
Good luck and feel free to reach out if you have more questions!