togethercomputer / RedPajama-Data

The RedPajama-Data repository contains code for preparing large datasets for training large language models.
Apache License 2.0
4.53k stars 346 forks source link

Fine tuning RedPajama Model #61

Closed adjhawar closed 1 month ago

adjhawar commented 1 year ago

Hi, How do I finetune the RedPajama on my dataset? Is there a training script that ai can reuse?

mauriceweber commented 1 year ago

You can use the finetuning scripts in https://github.com/togethercomputer/OpenChatKit.

There is also a section in our blog post that explains how to use the scripts to finetune the RP-INCITE chat models here.