Performance inconsistency between paper and reproduce.

zhijiew commented 3 years ago

Thank you for your great work. I learned a lot from your paper.

I tested the pre-trained models you provided (ResNet50-based, Pascal VOC, 1-shot), and I get slightly better performance than the numbers reported in your paper.

| Method | Split0 | Split1 | Split2 | Split3 | Mean |
| --- | --- | --- | --- | --- | --- |
| Reproduce | 61.8 | 69.9 | 56.3 | 56.6 | 61.2 |
| Paper | 61.7 | 69.5 | 55.4 | 56.3 | 60.8 |

Is this performance fluctuation within the normal range? I used the same code and settings as in your GitHub repo.

I also trained another baseline myself (ResNet50-based, Pascal VOC, 5-shot) using your configs.

| Method | Split0 | Split1 | Split2 | Split3 | Mean |
| --- | --- | --- | --- | --- | --- |
| Reproduce | 64.7 | 71.5 | 55.5 | 60.6 | 63.1 |
| Paper | 63.1 | 70.7 | 55.8 | 57.9 | 61.9 |

The fluctuation seems larger here: 1.2 points in the mean, and 2.7 points on Split3.
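For reference, a minimal sketch that computes the per-split gaps (reproduce minus paper) from the numbers in the two tables above. The dictionary layout and the helper names `split_gaps` and `mean_gap` are illustrative only and are not part of the PFENet repo.

```python
# mIoU values copied from the two tables above (ResNet50, Pascal VOC).
results = {
    "1-shot": {
        "reproduce": [61.8, 69.9, 56.3, 56.6],
        "paper": [61.7, 69.5, 55.4, 56.3],
    },
    "5-shot": {
        "reproduce": [64.7, 71.5, 55.5, 60.6],
        "paper": [63.1, 70.7, 55.8, 57.9],
    },
}


def split_gaps(reproduce, paper):
    """Per-split difference (reproduce - paper), rounded to one decimal."""
    return [round(r - p, 1) for r, p in zip(reproduce, paper)]


def mean_gap(reproduce, paper):
    """Difference between the averages over the splits, rounded to one decimal."""
    avg_reproduce = sum(reproduce) / len(reproduce)
    avg_paper = sum(paper) / len(paper)
    return round(avg_reproduce - avg_paper, 1)


gaps = {name: split_gaps(v["reproduce"], v["paper"]) for name, v in results.items()}
means = {name: mean_gap(v["reproduce"], v["paper"]) for name, v in results.items()}

for name in results:
    print(f"{name}: per-split gaps {gaps[name]}, mean gap {means[name]}")
```

With the table values, the 1-shot gaps stay under one point on every split, while the 5-shot gaps reach 1.6 on Split0 and 2.7 on Split3.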

tianzhuotao commented 3 years ago

@zhijiew Hi, thanks for your attention.

This repo is a cleaned-up reimplementation: it removes some redundant parameter definitions and functions from the original code we used to obtain the results in the paper. I do not know why this repo sometimes achieves better performance than the reported numbers; differences in the running environment might cause the fluctuation.

You can find more details in this issue: https://github.com/dvlab-research/PFENet/issues/6.

The 1-shot performance variance of your reproduction is acceptable, according to the issue above.

As for the 5-shot results: in our paper, we took the model trained in the 1-shot setting and tested it directly with 5 support samples, without training it for 5-shot. This is probably not the optimal setup, which could explain why you get a much better 5-shot result than the one we reported.

Thank you.

zhijiew commented 3 years ago

Got it! Thank you very much for your reply!

dvlab-research / PFENet

Performance inconsistency between paper and reproduce. #43