microsoft / LLMLingua

To speed up LLM inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
https://llmlingua.com/
MIT License

[Question] Compressor fine-tune #116

Open · alexandreteles opened this issue 3 months ago

alexandreteles commented 3 months ago

Describe the issue

Greetings,

Are there any plans to release instructions, or at least the dataset format, so we can fine-tune llmlingua-2-xlm-roberta-large-meetingbank or the base xlm-roberta-large into a custom compressor? If not, could you give some general guidance on how we could approach this?

Of course, a ready-made pipeline where we could simply plug in the data and fine-tune the models would be ideal for simplicity's sake, but more general, practical information on the process would also be very helpful.

Thank you!

iofu728 commented 3 months ago

Hi @alexandreteles, thank you for your interest in our project.

In fact, we have released the entire data collection pipeline and scripts at https://github.com/microsoft/LLMLingua/tree/main/experiments/llmlingua2/data_collection. You can build your own compressor based on this. The open-sourcing of the dataset itself has been delayed by the review process; once it's approved, we will release it at this HF link.
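
For anyone landing here before the dataset is released: LLMLingua-2 frames compression as binary token classification (keep vs. drop) on top of an xlm-roberta-large encoder, so a custom fine-tune can be sketched with standard Hugging Face tooling. The snippet below is a minimal, unofficial sketch; the tiny inline dataset and the word-level 0/1 label format are illustrative assumptions, not the project's released format.

```python
# Unofficial sketch: fine-tune xlm-roberta-large as a binary token
# classifier (1 = keep, 0 = drop), in the spirit of LLMLingua-2.
from datasets import Dataset
from transformers import (
    AutoModelForTokenClassification,
    AutoTokenizer,
    DataCollatorForTokenClassification,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "xlm-roberta-large"

# Assumed example format: pre-split words plus a per-word keep/drop label,
# e.g. derived by aligning an LLM-compressed text against its source.
raw = [
    {"words": ["The", "quarterly", "meeting", "was", "held", "on", "Monday"],
     "labels": [0, 1, 1, 0, 1, 0, 1]},
]
dataset = Dataset.from_list(raw)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def tokenize_and_align(example):
    # Tokenize pre-split words and copy each word's label onto all of its
    # sub-tokens; special tokens get -100 so the loss ignores them.
    enc = tokenizer(example["words"], is_split_into_words=True,
                    truncation=True, max_length=512)
    enc["labels"] = [
        -100 if word_id is None else example["labels"][word_id]
        for word_id in enc.word_ids()
    ]
    return enc

tokenized = dataset.map(tokenize_and_align, remove_columns=["words"])

model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME,
                                                        num_labels=2)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="custom-compressor",
        per_device_train_batch_size=8,
        num_train_epochs=3,
        learning_rate=2e-5,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForTokenClassification(tokenizer),
)
trainer.train()
```

At inference time, one would run the classifier over a prompt and drop tokens whose keep-probability falls below a chosen threshold; the official data_collection scripts linked above are the authoritative reference for producing the labels.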