berendjansen opened this issue 4 years ago
Hi, great that everything works already and you tested it on CIFAR-10!
We also came across their "improved" approach during preparation and decided not to present it to you, as not every group will have to go in that direction. In general, both options you proposed are fine, but option 2 might be more interesting and more manageable in terms of workload. The goal of the course is also for you to be creative and think outside the box, and as TAs we are always happy to read about new ideas in your report. Even though the code for the improved paper already exists, I expect you would spend at least one more week reproducing it, and then run low on time to actually explore and analyse the model, which should be the focus of your report. Still, the decision is up to you, and we would also be interested in an analysis of the improved paper.
Regarding the datasets in option 2: I think your best option is to generate datasets yourself. For instance, you could add different backgrounds to MNIST digits, rotate them, etc. This gives you a dataset that is simpler than CIFAR and more control over the level of noise you introduce to the input.
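As a minimal sketch of what such a generated dataset could look like (assuming numpy and scipy are available; the helper name and parameters below are illustrative, not from either paper), one could perturb each 28x28 MNIST image with a random rotation and a noisy background whose level you control:

```python
import numpy as np
from scipy.ndimage import rotate

def perturb_digit(img, rng, max_angle=30.0, noise_level=0.5):
    """Hypothetical helper: rotate one MNIST-style image (28x28 floats
    in [0, 1]) by a random angle and fill the background with uniform
    noise. `noise_level` controls how noisy the background gets."""
    # Random rotation, keeping the original 28x28 shape.
    angle = rng.uniform(-max_angle, max_angle)
    out = rotate(img, angle, reshape=False, order=1,
                 mode="constant", cval=0.0)
    out = np.clip(out, 0.0, 1.0)
    # Random noise background: it shows through wherever the
    # (rotated) digit is darker than the noise.
    background = rng.uniform(0.0, noise_level, size=img.shape)
    return np.maximum(out, background)

rng = np.random.default_rng(0)
digit = np.zeros((28, 28))
digit[10:18, 12:16] = 1.0  # stand-in for a real MNIST digit
noisy = perturb_digit(digit, rng)
print(noisy.shape)  # (28, 28)
```

Sweeping `noise_level` from 0 toward 1 would give you a family of datasets between clean MNIST and something closer to natural-image noise, which is exactly the "in between" complexity you asked about.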
Let's discuss the details tomorrow in the werkcollege (tutorial session)!
Hi Phillip,
Our implementation of the model works on MNIST digits, and we are currently looking for an extension. We tweaked the model to make it work on the CIFAR-10 dataset; however, the prototypes we obtain are not meaningful or descriptive, even though the classification accuracy is around 65%. We assume this is due to the type of image (noisy natural images vs. clean MNIST digits).
The authors of the original paper have published a new (improved) version of their model, as they realised that the original "fails to produce realistic prototype images" for natural images.
The new paper: http://papers.nips.cc/paper/9095-this-looks-like-that-deep-learning-for-interpretable-image-recognition
We are currently thinking about the following extensions of the paper:
1. Reproduce their improved paper, which uses feature prototypes instead of whole-image prototypes. Would this be sufficient and realistic (in terms of workload) for this course?
2. Continue tweaking the existing model by implementing, e.g., noise reduction, VAEs, or prototype weighting. For this we would need a dataset that sits somewhere between MNIST digits and CIFAR-10 in complexity, as we already know that on complex images this model will likely not produce meaningful prototypes.
What is your opinion about these options, and can you think of a dataset that might work for option 2? Perhaps we can discuss this during the werkcollege tomorrow.
Thanks!