Open Alfo5123 opened 5 years ago
Dear @Alfo5123,
Thanks for your interest. @tatevmejunts is doing an experiment on overfitting with LSTM networks on a sentence classification task as part of her undergrad thesis. I think it is possible to try more models on more tasks and compare the results.
No one is working on dataset distillation yet.
Thanks for the reply. I am interested in working on the Dataset Distillation project with @marinagomtsian.
Perfect. I'd suggest you read the paper and come up with a more detailed plan for the project. I think I'll then be able to comment on it and discuss it.
@Hrant-Khachatrian, @marinagomtsian and I discussed this and would like to start with the following plan:

1. Find an NLP task.
2. Search for baseline pre-trained models for the chosen task.
3. Review the authors' implementation of "Dataset Distillation" and adapt it to our case.
4. Run several experiments for different settings and tasks (distilled dataset, domain transfer, and adversarial attacks).
5. If possible, provide a better mathematical proof of a lower bound.

On the last point: we found that the proof the authors propose is quite simplistic and not very applicable to the kinds of neural network architectures actually used.
We created this repository to upload our progress: https://github.com/Alfo5123/Text-Distill
@Hrant-Khachatrian
I am sorry for the late reply. Although we have not been working on this during the last few months, I would like to stress my interest in continuing and finishing this project. Marina has been accepted into a master's program and won't be able to continue working on this; however, I have a bad job which gives me enough time to work on the project. We have already added to our repository the code for the pre-trained CNN sentence classification model, which we will use to distill the dataset.
Thanks for your time and consideration. Sincerely yours, Alfredo
Sounds good!
I am interested in working on the "Overfitting ability of recurrent networks" project for a non-image classification task. However, I noticed on the website that someone is already working on it. Is it possible to work on it in parallel? Or would it be better to tackle another project (I am also interested in "Dataset distillation for other tasks")? Thanks