Dahoas opened this issue 5 months ago (status: Open)
I tried reading this paper, but it's kinda poorly explained. Or maybe I'm just too dense, or I lack the background knowledge on dataset distillation through gradient matching. Their results seem promising (a large performance increase compared to random/k-center data selection), but it's not clear to me how much extra compute you would need to arrive at their distilled dataset. Would love to hear other people's opinions on this paper.
To close this issue, open a PR with a paper report using the provided report template.