Lichang-Chen / AlpaGasus

A better Alpaca model trained with less data (only 9k instructions from the original set)
https://lichang-chen.github.io/AlpaGasus/
data-centric-ai filtering-data instruction-following large-language-models

AlpaGasus: Training a Better Alpaca with Fewer Data (ICLR 2024)

Lichang Chen*, Shiyang Li*, Jun Yan, Hai Wang, Kalpa Gunaratna, Vikas Yadav, Zheng Tang, Vijay Srinivasan, Tianyi Zhou, Heng Huang, Hongxia Jin

*Denotes equal contribution

Project page | Paper


Our model "AlpaGasus" is pronounced "/ˈælpəˈɡeɪsəs/" or "/ˈælpəˈɡəsəs/". The logo was generated by Midjourney.

Citation

If you find our paper useful, please consider citing:

@inproceedings{
    chen2024alpagasus,
    title={AlpaGasus: Training a Better Alpaca with Fewer Data},
    author={Lichang Chen and Shiyang Li and Jun Yan and Hai Wang and Kalpa Gunaratna and Vikas Yadav and Zheng Tang and Vijay Srinivasan and Tianyi Zhou and Heng Huang and Hongxia Jin},
    booktitle={The Twelfth International Conference on Learning Representations},
    year={2024},
    url={https://openreview.net/forum?id=FdVXgSJhvz}
}