Dahoas / QDSyntheticData


Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality #216

Open Dahoas opened 5 months ago

Dahoas commented 5 months ago

To close this issue, open a PR with a paper report using the provided report template.

kcoost commented 4 months ago

I tried reading this paper, but it's somewhat poorly explained, or maybe I'm just lacking the required background knowledge on dataset distillation through gradient matching. Their results seem promising (a large performance increase compared to random and k-center data selection), but it's not clear to me how much extra compute you would need to arrive at their distilled dataset. Would love to hear other people's opinions on this paper.
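For anyone else missing the background: below is a minimal sketch of the gradient-matching objective this line of dataset-distillation work builds on (one inner step in the style of Zhao et al.'s Dataset Condensation, which the paper extends with repeated distillation). The model, loss, and optimizer setup here are illustrative assumptions, not this paper's exact recipe. The idea is to optimize the synthetic examples so that the gradients a network computes on them match the gradients it computes on real data.

```python
import torch
import torch.nn.functional as F

def grad_match_loss(real_grads, syn_grads):
    # Sum of cosine distances between corresponding gradient tensors.
    loss = 0.0
    for gr, gs in zip(real_grads, syn_grads):
        loss = loss + (1.0 - F.cosine_similarity(gr.flatten(), gs.flatten(), dim=0))
    return loss

def distill_step(model, criterion, syn_x, syn_y, real_x, real_y, syn_opt):
    params = [p for p in model.parameters() if p.requires_grad]

    # Gradients of the task loss on a real-data batch (treated as fixed targets).
    real_grads = torch.autograd.grad(criterion(model(real_x), real_y), params)
    real_grads = [g.detach() for g in real_grads]

    # Gradients on the synthetic batch, kept differentiable w.r.t. syn_x.
    syn_grads = torch.autograd.grad(
        criterion(model(syn_x), syn_y), params, create_graph=True
    )

    # Update the synthetic examples so their gradients match the real ones.
    syn_opt.zero_grad()
    loss = grad_match_loss(real_grads, syn_grads)
    loss.backward()
    syn_opt.step()
    return loss.item()

# Hypothetical usage: 10 synthetic images per class for CIFAR-10-sized inputs.
# model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
# criterion = torch.nn.CrossEntropyLoss()
# syn_x = torch.randn(100, 3, 32, 32, requires_grad=True)
# syn_y = torch.arange(10).repeat_interleave(10)
# syn_opt = torch.optim.SGD([syn_x], lr=0.1)
```

The extra-compute question above comes down to how many of these matching steps (and how many network re-initializations) the method runs per distilled image, which is exactly what the paper leaves unclear.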