ml-unito / DeepWordLearning


Training on unseen classes; Moving to mini-Imagenet? #5

Open Pibborn opened 6 years ago

Pibborn commented 6 years ago

Vinyals et al. introduce the mini-ImageNet dataset through:

  1. image downsampling (84x84)
  2. a reduced number of classes (100)
  3. a reduced number of images per class (600)

with respect to regular ImageNet. Some authors (I actually only know of Larochelle doing it) have adopted this as a standard of sorts for one-shot learning tasks. Should we move to it? One strong reason is to enable better comparisons with the one-shot learning literature: the CNN model should not have seen the classes you are performing one-shot learning on (see note 1), so we would probably have to retrain our models if we want to compare against those papers. As previously discussed, this is of little use right now since we are not doing one-shot learning, but it is a direction I would be interested in.
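For reference, the three reductions above can be sketched as a small preprocessing script. This is just a sketch of the idea, not code from the repo: the directory layout, function name, and sampling seed are all assumptions.

```python
# Sketch of the mini-ImageNet reductions: sample 100 classes, 600 images
# per class, downsample each image to 84x84. Assumes an ImageNet-style
# layout of one subdirectory per class (wnid); all names are illustrative.
import os
import random
from PIL import Image

IMAGE_SIZE = (84, 84)    # 1. image downsampling
NUM_CLASSES = 100        # 2. class-count reduction
IMAGES_PER_CLASS = 600   # 3. per-class image reduction

def build_mini_imagenet(imagenet_dir, out_dir,
                        num_classes=NUM_CLASSES,
                        images_per_class=IMAGES_PER_CLASS,
                        size=IMAGE_SIZE, seed=0):
    rng = random.Random(seed)  # fixed seed so the class split is reproducible
    classes = sorted(os.listdir(imagenet_dir))
    for wnid in rng.sample(classes, num_classes):
        src = os.path.join(imagenet_dir, wnid)
        dst = os.path.join(out_dir, wnid)
        os.makedirs(dst, exist_ok=True)
        for fname in rng.sample(sorted(os.listdir(src)), images_per_class):
            img = Image.open(os.path.join(src, fname)).convert('RGB')
            img.resize(size, Image.BILINEAR).save(os.path.join(dst, fname))
```

Note that the original mini-ImageNet split fixes a specific set of 100 classes rather than sampling them, so for exact comparability we would want to use the published class list instead of a random draw.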

Note 1: this is interesting to me, since it seems that the similarity between the classes the base model was trained on and the classes we are evaluating on could be a major driver of the final one-shot accuracy. What about the following experiment:

  1. for each evaluation class, compute the average L2 distance between its CNN features and those of the training classes.
  2. check whether the one-shot model does markedly better on the closest classes, no matter which one-shot model is used. Point being, the CNN features are still doing most of the heavy lifting. Or maybe some one-shot models contradict this assumption, which would be kind of interesting!
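Step 1 above could look something like this. A minimal sketch, assuming features have already been extracted with the base CNN; the dict-of-arrays layout and function names are my own, not repo code.

```python
# Per-class centroid distances between CNN feature spaces.
# feats_by_class maps class name -> (n_images, feat_dim) feature array.
import numpy as np

def class_centroids(feats_by_class):
    """Mean CNN feature vector for each class."""
    return {c: f.mean(axis=0) for c, f in feats_by_class.items()}

def distance_to_base(base_feats, eval_feats):
    """For each evaluation class, the L2 distance from its centroid
    to the closest base-training-class centroid."""
    base = class_centroids(base_feats)
    return {c: min(np.linalg.norm(cent - b) for b in base.values())
            for c, cent in class_centroids(eval_feats).items()}
```

One could then correlate these per-class distances with per-class one-shot accuracy (e.g. a rank correlation) to see whether feature-space proximity predicts performance.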

A general note: one-shot models do not have the same level of "software maturity" as other deep models. Implementations are probably hard to find, and we may have to move to Lua to compare against them (for example, the one-shot LSTM work by Larochelle is in regular Torch).

Pibborn commented 6 years ago

Some ideas/instructions on how to move to mini-ImageNet.

I was able to find some code online to preprocess ImageNet to obtain mini-ImageNet, but to be honest it looks quite rough and some people have had issues with it.

I should probably start from the tf-slim ImageNet code, build some scripts around it that perform the downsampling and the class-count/example-count reductions, and release them. They should also probably create TFRecords. For this purpose, this tf tutorial will probably be helpful.
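The TFRecord-writing part could be sketched roughly as follows, independently of tf-slim. The feature keys mirror the convention used in the TensorFlow slim image datasets, but the function name and the (jpeg bytes, label) input format are assumptions of mine:

```python
# Serialize (encoded image, integer label) pairs into a TFRecord file.
import tensorflow as tf

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def write_tfrecord(examples, path):
    """examples: iterable of (jpeg_bytes, label) pairs."""
    with tf.io.TFRecordWriter(path) as writer:
        for image_bytes, label in examples:
            ex = tf.train.Example(features=tf.train.Features(feature={
                'image/encoded': _bytes_feature(image_bytes),
                'image/class/label': _int64_feature(label),
            }))
            writer.write(ex.SerializeToString())
```

The records can then be consumed at training time with `tf.data.TFRecordDataset` plus a parsing function that decodes the two features.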