MLEVN / projects

Open set of ideas for machine learning datasets, tasks, applications for projects, research papers and even companies

https://github.com/MLEVN/projects/issues

Available Projects #1

Open Alfo5123 opened 5 years ago

Alfo5123 commented 5 years ago

I was interested in working on the "Overfitting ability of recurrent networks" project for a non-image classification task. However, I noticed on the website that someone is already working on it. Is it possible to work on it in parallel? Or would it be better to tackle another project (I am also interested in "Dataset distillation for other tasks")? Thanks

Hrant-Khachatrian commented 5 years ago

Dear @Alfo5123,

Thanks for your interest. @tatevmejunts is doing an experiment on overfitting with LSTM networks on a sentence classification task as part of her undergrad thesis. I think it is possible to try more models on more tasks and compare the results.

No one is yet working on dataset distillation.

Alfo5123 commented 5 years ago

Thanks for the reply. I am interested in working on the Dataset Distillation project with @marinagomtsian.

Hrant-Khachatrian commented 5 years ago

Perfect. I'd suggest you read the paper and come up with a more detailed plan for the project. I think I'll then be able to comment on and discuss it.

Alfo5123 commented 5 years ago

@Hrant-Khachatrian, @marinagomtsian and I discussed this and agreed to start with the following plan:

1. Find an NLP task.
2. Search for baseline pre-trained models for the chosen task.
3. Review the authors' implementation of the "Dataset Distillation" method and adapt it to our case.
4. Run several experiments for different settings and tasks (distilled dataset, domain transfer, and adversarial attacks).
5. If possible, provide a better mathematical proof of a lower bound. We found that the proof the authors propose is quite simplistic and not very applicable to the kinds of neural network architectures used.

We created this repository to upload our progress: https://github.com/Alfo5123/Text-Distill
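For readers unfamiliar with the idea behind step 3 of the plan, here is a minimal sketch of dataset distillation on a toy problem. It is not the authors' implementation and not the code in the repository above. It assumes a linear regression model, a fixed zero initialization, a single inner gradient step with a fixed inner learning rate, and hand-derived gradients. All names (`make_toy_data`, `inner_step`, `distill`) and hyperparameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def make_toy_data(n=100, d=3, seed=1):
    # Hypothetical "real" dataset: noiseless linear regression targets.
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    w_true = np.array([1.0, -2.0, 0.5])
    return X, X @ w_true

def outer_loss(w, X, y):
    # Mean squared error of weights w on the real data.
    return float(np.mean((X @ w - y) ** 2))

def inner_step(Xs, ys, w0, inner_lr):
    # One gradient step on the synthetic data (Xs, ys), starting from w0.
    m = len(ys)
    grad = (2.0 / m) * Xs.T @ (Xs @ w0 - ys)
    return w0 - inner_lr * grad

def distill(X, y, m=3, inner_lr=0.5, outer_lr=0.05, steps=3000, seed=0):
    # Learn m synthetic points so that ONE inner step on them
    # gives weights with low loss on the real data (X, y).
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w0 = np.zeros(d)               # assumption: fixed, known initialization
    Xs = rng.normal(size=(m, d))   # synthetic inputs (learned)
    ys = np.zeros(m)               # synthetic targets (learned)
    c = 2.0 * inner_lr / m
    for _ in range(steps):
        w1 = inner_step(Xs, ys, w0, inner_lr)
        a = (2.0 / n) * X.T @ (X @ w1 - y)   # d(outer loss) / d(w1)
        r = Xs @ w0 - ys                     # residual on synthetic data
        # Chain rule through w1 = w0 - c * Xs.T @ r
        grad_Xs = -c * (np.outer(r, a) + np.outer(Xs @ a, w0))
        grad_ys = c * (Xs @ a)
        Xs -= outer_lr * grad_Xs
        ys -= outer_lr * grad_ys
    return Xs, ys, w0
```

A possible usage: call `Xs, ys, w0 = distill(*make_toy_data())`, then compare `outer_loss(w0, X, y)` with `outer_loss(inner_step(Xs, ys, w0, 0.5), X, y)` to see how much a single step on three synthetic points helps. For neural networks the gradients would normally come from automatic differentiation rather than a hand derivation.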

Alfo5123 commented 5 years ago

@Hrant-Khachatrian

I am sorry for the late reply. Although we have not been working on this during the last few months, I would like to stress my interest in continuing and finishing this project. Marina has been accepted into a master's program and won't be able to continue working on this; however, I have a bad job which gives me enough time to work on the project. We have already added to our repository the code for the pre-trained CNN sentence classification model, which we will use to distill the dataset.

Thanks for your time and consideration.

Sincerely yours,
Alfredo

Hrant-Khachatrian commented 5 years ago

Sounds good!