This code accompanies the LLMEmbed paper, accepted at the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024): https://aclanthology.org/2024.acl-long.433
`rep_extract.py` uses a language model to extract representations of the dataset and saves them as a `.pt` file.
`MyDataset.py` reads the representations from the `.pt` file.
`DownstreamModel.py` implements the co-occurrence pooling.
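
The `.pt` hand-off between the extraction step and the dataset reader might look roughly like the sketch below. It is an illustration only, not the code in `rep_extract.py` or `MyDataset.py`: the class name `RepresentationDataset`, the dictionary keys `reps` and `labels`, the file name `train_reps.pt`, and the random tensor standing in for language model outputs are all assumptions made for this example.

```python
import os
import tempfile

import torch
from torch.utils.data import Dataset


class RepresentationDataset(Dataset):
    """Hypothetical dataset that serves precomputed representations from a .pt file."""

    def __init__(self, path):
        # Assumed file layout: a dict with "reps" (N x D floats) and "labels" (N ints).
        data = torch.load(path)
        self.reps = data["reps"]
        self.labels = data["labels"]

    def __len__(self):
        return self.reps.shape[0]

    def __getitem__(self, idx):
        return self.reps[idx], self.labels[idx]


# Stand-in for language model outputs: 4 samples, 8 dimensions each.
reps = torch.randn(4, 8)
labels = torch.tensor([0, 1, 1, 0])

# Save once (the extraction step), then read back (the dataset step).
path = os.path.join(tempfile.mkdtemp(), "train_reps.pt")
torch.save({"reps": reps, "labels": labels}, path)

dataset = RepresentationDataset(path)
first_rep, first_label = dataset[0]
```

Saving the representations once and reloading them means the language model does not have to run again each time the downstream model is trained.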
This work has been accepted to [ACL 2024](https://aclanthology.org/2024.acl-long.433). Please cite the paper if you use LLMEmbed or this repository in your research. Thank you very much 😉
@inproceedings{chunliu2024llmembed,
title={LLMEmbed: Rethinking Lightweight LLM’s Genuine Function in Text Classification},
author={Liu, Chun and Zhang, Hongguang and Zhao, Kainan and Ju, Xinghai and Yang, Lin},
booktitle={Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
pages={7994--8004},
year={2024}
}