deepset-ai / haystack

:mag: AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) into pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search, or conversational agent chatbots.
https://haystack.deepset.ai
Apache License 2.0

Distilling RoBERTa-base #2197

Closed MichelBartels closed 2 years ago

MichelBartels commented 2 years ago

Distilling RoBERTa using the approach described in the TinyBERT paper. The results of #2019 suggest that it makes more sense to proceed with RoBERTa as the base model. The Pile dataset can be used for the first distillation step.
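As background, TinyBERT-style distillation matches the student to the teacher on intermediate representations (hidden states, attention) in a general distillation stage and on softened output logits in a task-specific stage. The snippet below is a minimal, dependency-free sketch of the logit-level component only: cross-entropy against temperature-softened teacher probabilities. The function names and the temperature value are illustrative, not taken from this issue or from Haystack's implementation.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher T produces softer target distributions.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Cross-entropy between the teacher's softened distribution and the
    # student's, scaled by T^2 so gradients keep a comparable magnitude
    # across temperatures (standard soft-label distillation practice).
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    ce = -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))
    return temperature ** 2 * ce
```

When the student's logits equal the teacher's, the loss reduces to T² times the teacher's entropy, which is its minimum; any mismatch increases it, so minimizing this loss pulls the student's distribution toward the teacher's.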

MichelBartels commented 2 years ago

These are the results:

- RoBERTa base: EM: 78.4%, F1: 82.6%
- Distilled model: EM: 76.6%, F1: 81.0%

MichelBartels commented 2 years ago

The resulting model was released as deepset/tinyroberta-squad2.