replit / ReplitLM

Inference code and configs for the ReplitLM model family
https://huggingface.co/replit
Apache License 2.0

Excuse me, how to fine-tune this LLM model? #4

Open kanseaveg opened 1 year ago

kanseaveg commented 1 year ago

I am a beginner in the LLM space. Could the maintainers provide some methods or tutorials for fine-tuning this LLM model? I would be very grateful for this, thank you. @madhavatreplit @Replit

Symbolk commented 1 year ago

+1, the docs say that "Replit intends this model be used by anyone as a foundational model for application-specific fine-tuning without strict limitations on commercial use.", so a tutorial or some runnable code would be really appreciated!

madhavatreplit commented 1 year ago

Sorry for getting to this late!

The model is a subclassed Hugging Face model and should be compatible with the usual process of fine-tuning a pretrained Hugging Face model.

The following guide should be a good place to start. You can use accelerate for distributed training, along with other standard tooling in the Hugging Face ecosystem, as needed.
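For reference, the "usual process" looks roughly like the sketch below. This is a minimal illustration, not tested against ReplitLM end to end; the checkpoint name, `train.txt` data file, and training hyperparameters are placeholders to adapt, and `trust_remote_code=True` is needed because the model code lives on the Hub. The token-packing helper mirrors the grouping step from the Hugging Face causal-LM guide.

```python
def group_texts(token_ids, block_size):
    """Pack a flat list of token ids into fixed-length training blocks,
    dropping the ragged remainder (the packing step from the HF causal-LM guide)."""
    total = (len(token_ids) // block_size) * block_size
    return [token_ids[i:i + block_size] for i in range(0, total, block_size)]


def main():
    # Heavy imports are kept inside main() so the packing helper above
    # can be reused without transformers/datasets installed.
    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    model_id = "replit/replit-code-v1-3b"  # adjust to the checkpoint you use
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    # "train.txt" is a placeholder: one training document per line.
    raw = load_dataset("text", data_files={"train": "train.txt"})["train"]
    ids = [tid for row in raw for tid in tokenizer(row["text"])["input_ids"]]
    train_ds = [{"input_ids": block} for block in group_texts(ids, block_size=512)]

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="out",
                               per_device_train_batch_size=1,
                               num_train_epochs=1),
        train_dataset=train_ds,
        # mlm=False makes the collator copy input_ids into labels for causal LM.
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()


if __name__ == "__main__":
    main()
```

In practice you would likely add gradient accumulation, mixed precision, and accelerate/DeepSpeed settings on top of this skeleton.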

Let me know if you run into any issues; I'm happy to dig deeper and help out.

sadiqj commented 1 year ago

@madhavatreplit I think there's some code missing from this model. Looking at the mosaic_gpt it was trained with, ComposerMosaicGPT has loss functions and code for rearranging the targets, which aren't in the ReplitLM source.

The following code taken from that guide doesn't seem to work: https://gist.github.com/sadiqj/d7ccef396e123e1c6a23c4e3a6479549

It results in TypeError: ReplitLM.forward() got an unexpected keyword argument 'labels'
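(For anyone hitting this before the fix landed: when a causal-LM forward doesn't accept labels, one workaround is to compute the loss outside the model. A minimal sketch of that computation is below; causal_lm_loss is a hypothetical helper, and the shift-by-one mirrors what Hugging Face LM heads do internally when they receive labels.)

```python
import torch
import torch.nn.functional as F


def causal_lm_loss(logits, input_ids):
    """Next-token cross-entropy: the prediction at position t is scored
    against the token at position t+1, so both tensors are shifted by one."""
    shift_logits = logits[:, :-1, :].contiguous()   # (batch, seq-1, vocab)
    shift_labels = input_ids[:, 1:].contiguous()    # (batch, seq-1)
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
    )
```

In a training loop you would call the model without labels (e.g. `logits = model(input_ids).logits`, assuming a standard output object) and then backpropagate through `causal_lm_loss(logits, input_ids)`.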

madhavatreplit commented 1 year ago

Hey, thanks for flagging this. Taking a look and will get back to you with an update here.

pirroh commented 1 year ago

Hey folks! We just updated the model on Hugging Face and the README file in this repo with detailed instructions on how to fine-tune and instruct-tune the model. Let us know if you have any issues. Have fun tuning, and post your results and derivative models here :)