indy-lab / ProtoTransfer

Official code for the paper "Self-Supervised Prototypical Transfer Learning for Few-Shot Classification"
https://arxiv.org/abs/2006.11325
MIT License

some question about the paper #4

Closed freeman-1995 closed 3 years ago

freeman-1995 commented 3 years ago

Hi ~, great job!

You introduce self-supervised training before downstream training, so the innovation of the paper is training a good embedding space via self-supervision. The downstream training is the same as a prototypical network. I did some research on self-supervised learning with a ResNet-50 before. After self-supervised training, I fed the downstream task's test dataset into the model and visualized the embeddings: all classes' embeddings were completely mixed together. I don't know if it is the same in your experiments.

ArnoutDevos commented 3 years ago

> Hi ~, great job!

Thank you for checking out our repo and paper!

> You introduce self-supervised training before downstream training, so the innovation of the paper is training a good embedding space via self-supervision.

Correct. The final performance is a combination of two things: our supervised fine-tuning procedure (ProtoTune) works well together with the self-supervised embedding (ProtoCLR) because both are based on the same principles.
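For intuition, here is a toy PyTorch sketch of the shared prototypical-contrastive principle (the `encoder` is a placeholder): each original image in a batch acts as its own 1-shot prototype, an augmented view of it acts as the query, and a distance-based cross-entropy pulls each query toward its prototype. This illustrates the general idea only, not the actual ProtoCLR implementation in this repo:

```python
import torch
import torch.nn.functional as F

def proto_clr_style_loss(encoder, x_orig, x_aug):
    # Toy prototypical-contrastive loss (illustration only, not ProtoCLR itself).
    # x_orig: (N, C, H, W) original images  -> treated as 1-shot prototypes
    # x_aug:  (N, C, H, W) one augmented view per image -> queries
    protos = encoder(x_orig)    # (N, D) prototype embeddings
    queries = encoder(x_aug)    # (N, D) query embeddings
    # Negative squared Euclidean distances act as logits over prototypes.
    logits = -torch.cdist(queries, protos) ** 2   # (N, N)
    # Query i should be closest to prototype i.
    labels = torch.arange(protos.size(0), device=logits.device)
    return F.cross_entropy(logits, labels)
```

Because the downstream prototypical classifier uses the same distance-based softmax, the pre-trained embedding is already shaped for few-shot inference.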

> The downstream training is the same as a prototypical network [ProtoNet].

More or less. Using (non-parametric) ProtoNet downstream inference on top of our embedding already works well, but with our (parametric) ProtoTune fine-tuning approach we can increase performance significantly, especially when more shots are available. See Table 3 in our paper for a comparison.
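To make the non-parametric inference concrete, a minimal sketch (with a hypothetical `encoder` and episode tensors; not code from this repo) looks like this: class prototypes are the mean support embeddings, and each query gets the label of the nearest prototype.

```python
import torch

@torch.no_grad()
def protonet_predict(encoder, support_x, support_y, query_x, n_way):
    # Embed support and query images with the frozen pre-trained encoder.
    z_support = encoder(support_x)    # (n_way * k_shot, D)
    z_query = encoder(query_x)        # (n_query, D)
    # Class prototype = mean embedding of that class's support examples.
    protos = torch.stack([z_support[support_y == c].mean(dim=0)
                          for c in range(n_way)])    # (n_way, D)
    # Assign each query to the nearest prototype (Euclidean distance).
    return torch.cdist(z_query, protos).argmin(dim=1)
```

Running this once per episode corresponds to the ProtoNet-on-embedding baseline; ProtoTune goes further by fine-tuning parametrically on the support set.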

> I did some research on self-supervised learning with a ResNet-50 before. After self-supervised training, I fed the downstream task's test dataset into the model and visualized the embeddings: all classes' embeddings were completely mixed together. I don't know if it is the same in your experiments.

That's unfortunate. We release our t-SNE plotting code (Figure 3 in our paper) and Conv-4 embedding models, for a selection of both training and testing classes of mini-ImageNet, here: https://github.com/indy-lab/ProtoTransfer/blob/master/omni-mini/plots/tsne_plots.ipynb and https://github.com/indy-lab/ProtoTransfer/blob/master/omni-mini/plots/

Note that in Figure 3, as expected, the test classes are less structured than the training classes (because the embedding was never trained on the test classes). In your case, we recommend also looking at the training classes' structure and comparing it with the test classes' structure. Checking the train/test set classification performance is a good idea too.
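If you want a quick way to generate such plots, a generic scikit-learn sketch (assuming `embeddings` is an (N, D) array of encoder outputs and `labels` are integer class ids; this is not the linked notebook's exact code) could look like:

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_tsne(embeddings, labels, title):
    # embeddings: (N, D) array of encoder outputs; labels: (N,) class ids.
    coords = TSNE(n_components=2, init="pca", random_state=0).fit_transform(embeddings)
    plt.figure(figsize=(5, 5))
    plt.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="tab10", s=8)
    plt.title(title)
    plt.show()
```

Calling it once on training-class embeddings and once on test-class embeddings makes the structure comparison above immediate.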

We hope this helps!

ArnoutDevos commented 3 years ago

If this issue needs further attention, let us know. Otherwise we will close it after around 1 week of inactivity.