AkihikoWatanabe / paper_notes

Paper notes, added occasionally
https://AkihikoWatanabe.github.io/paper_notes

Dataset Distillation with Attention Labels for Fine-tuning BERT, ACL'23 #827

Open AkihikoWatanabe opened 1 year ago

AkihikoWatanabe commented 1 year ago

https://virtual2023.aclweb.org/paper_P5706.html

AkihikoWatanabe commented 1 year ago

Dataset distillation aims to create a small dataset of informative synthetic samples to rapidly train neural networks that retain the performance of the original dataset. In this paper, we focus on constructing distilled few-shot datasets for natural language processing (NLP) tasks to fine-tune pre-trained transformers. Specifically, we propose to introduce attention labels, which can efficiently distill the knowledge from the original dataset and transfer it to the transformer models via attention probabilities. We evaluated our dataset distillation methods on four different NLP tasks and demonstrated that it is possible to create distilled few-shot datasets with the attention labels, yielding impressive performance for fine-tuning BERT. Specifically, on AGNews, a four-class news classification task, our distilled few-shot dataset achieved up to 93.2% accuracy, which is 98.5% of the performance of the original dataset, even with only one sample per class and only one gradient step.

Translation (by gpt-3.5-turbo)

AkihikoWatanabe commented 10 months ago

A remarkable result: with dataset distillation, only one sample per class was enough to reach 98.5% of the performance obtained with the full original dataset (Maekawa-kun).
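
For context, a minimal, hypothetical sketch (not the authors' released code) of the inner fine-tuning step the abstract describes: one gradient step of BERT on a tiny distilled set whose samples carry soft class labels plus per-head "attention labels" that the model's attention probabilities are pulled toward. The tensor names (`distilled_embeds`, `soft_labels`, `attention_labels`) and the loss weighting are assumptions, and the outer loop that would actually learn the distilled data by backpropagating through this step is omitted.

```python
# Sketch of one fine-tuning step on a distilled few-shot set with attention labels.
# Assumptions: distilled samples are stored as input embeddings, attention labels are
# target attention-probability maps per layer/head, and the attention term uses MSE.
import torch
import torch.nn.functional as F
from transformers import BertForSequenceClassification

NUM_CLASSES = 4          # e.g. AGNews: one distilled sample per class
SEQ_LEN, HIDDEN = 32, 768
ATTN_WEIGHT = 1.0        # assumed weight of the attention-matching term

model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=NUM_CLASSES, output_attentions=True
)

# Distilled data: learnable in the (omitted) outer loop; fixed tensors here.
distilled_embeds = torch.randn(NUM_CLASSES, SEQ_LEN, HIDDEN)
soft_labels = torch.eye(NUM_CLASSES)  # soft class labels (here one-hot for simplicity)

# One target attention map per (layer, sample, head); uniform placeholders here.
num_layers = model.config.num_hidden_layers
num_heads = model.config.num_attention_heads
attention_labels = torch.full(
    (num_layers, NUM_CLASSES, num_heads, SEQ_LEN, SEQ_LEN), 1.0 / SEQ_LEN
)

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# Single gradient step of fine-tuning on the distilled few-shot set.
outputs = model(inputs_embeds=distilled_embeds)
task_loss = F.cross_entropy(outputs.logits, soft_labels)  # soft-target CE (PyTorch >= 1.10)

# Attention-matching loss: pull each layer's attention probabilities toward its labels.
attn_loss = torch.stack(
    [F.mse_loss(attn, attention_labels[i]) for i, attn in enumerate(outputs.attentions)]
).mean()

loss = task_loss + ATTN_WEIGHT * attn_loss
loss.backward()
optimizer.step()
```

In the paper's setting the distilled embeddings, labels, and attention labels would themselves be optimized against the real training data by differentiating through a step like this; the snippet only shows how an attention-matching term could enter the fine-tuning loss alongside the usual classification loss.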