mosaicml / llm-foundry

LLM training code for Databricks foundation models
https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm
Apache License 2.0

MPT for Sequence Classification #611

Open boomanaiden154 opened 1 year ago

boomanaiden154 commented 1 year ago

I'm interested in using the llm-foundry infrastructure to train LLMs for sequence classification/regression tasks. I currently have a fork of llm-foundry where I got this working for the MPT models provided by the repository, by creating a new MPTForSequenceRegression class and an associated Composer model. The implementation is fairly hacky and definitely needs to be cleaned up. HuggingFace also provides sequence classification versions of most of its LLMs, which would just require a Composer wrapper.

Is there interest in having tooling for sequence classification/regression live upstream in llm-foundry? If such patches would be accepted, I'd be interested in cleaning up and upstreaming what I have so far, and probably also writing some documentation on finetuning for these tasks.

dakinggg commented 1 year ago

Hey @boomanaiden154, the approach seems right! You should still be able to use the base HuggingFaceModel in composer, and just add the head classes as you described. Support for sequence classification/regression would be great!
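For readers following along, here is a rough, dependency-free sketch of what a head class like the ones mentioned above typically computes. It is not the code from the fork. It assumes the common causal-LM convention of pooling the hidden state of the last non-padding token (with right padding) and projecting it through a linear layer; with a single output this acts as a regression head. All function names here (`last_token_index`, `pool_last_token`, `linear_head`) are hypothetical, and a real implementation would use torch tensors and an `nn.Linear` layer instead of Python lists.

```python
from typing import List, Sequence


def last_token_index(attention_mask: Sequence[int]) -> int:
    """Return the index of the last non-padding token.

    Assumes right padding: 1 marks a real token, 0 marks padding.
    """
    mask = list(attention_mask)
    if 1 not in mask:
        raise ValueError("sequence has no non-padding tokens")
    # Position of the first 1 when scanning from the end.
    return len(mask) - 1 - mask[::-1].index(1)


def pool_last_token(
    hidden_states: Sequence[Sequence[float]],
    attention_mask: Sequence[int],
) -> List[float]:
    """Select the hidden state of the last real token as the sequence summary."""
    return list(hidden_states[last_token_index(attention_mask)])


def linear_head(
    pooled: Sequence[float],
    weight: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> List[float]:
    """Project the pooled vector to one value per label.

    weight has shape (num_labels, hidden_size). With num_labels == 1 the
    single output can be read as a regression target.
    """
    return [
        sum(w * x for w, x in zip(row, pooled)) + b
        for row, b in zip(weight, bias)
    ]


# Toy example: 3 positions, hidden size 2, last position is padding.
hidden = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]
mask = [1, 1, 0]
pooled = pool_last_token(hidden, mask)  # hidden state at position 1
logits = linear_head(pooled, weight=[[1.0, 0.0], [0.5, 0.5]], bias=[0.0, 1.0])
print(logits)  # [3.0, 4.5]
```

In a setup like the one suggested in the reply, the backbone would supply `hidden_states`, and the resulting model would be passed to the Composer wrapper; the pooling rule is the main design choice, since a causal model only sees the full sequence at its final real token.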