What is the experimental protocol for evaluating on ImageNet-based benchmark?

aosokin commented 4 years ago

Hi Joseph, I'm a bit confused about the experimental protocol for the ImageNet-based benchmark used in your work and would be very grateful for clarification. Specifically, when detecting a class in some episode, do you detect it only on the query images belonging to that same class (option 1) or on all query images of the episode (option 2)?

The RepMet paper (the second paragraph of Section 5.2) suggests option 2. The file data/Imagenet_LOC/episodes/epi_inloc_in_domain_1_5_10_500.pkl (1-shot 5-way setting) that you kindly provided supports the same conclusion, as all the query images of the episode are concatenated into one list.

However, the code seems to support option 1. Function test_model from few_shot_benchmark seems to detect a class only on the images corresponding to the same category query_images[cat]: https://github.com/jshtok/RepMet/blob/9bdc3f20ff08a8b3ce005af327aba6bf0bb71213/fpn/few_shot_benchmark.py#L668-L693 The code uses another file from your package: output/RepMet_inloc_1shot_5way_10qpc_500epi_episodes.npz, which has all the query images separated by category.
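To make the difference between the two options concrete, here is a minimal sketch that enumerates the (image, class) pairs scored under each protocol. It is only an illustration: the function name `evaluation_pairs`, the toy episode, and the file names are hypothetical and are not taken from the RepMet code.

```python
from typing import Dict, List, Set, Tuple


def evaluation_pairs(
    query_images: Dict[str, List[str]], option: int
) -> Set[Tuple[str, str]]:
    """Return the (image, class) pairs on which detections would be scored.

    query_images maps each episode class to its own query images
    (hypothetical layout, loosely mirroring the per-category grouping).
    """
    classes = list(query_images)
    pairs: Set[Tuple[str, str]] = set()
    for cat, images in query_images.items():
        for img in images:
            if option == 1:
                # Option 1: a class is detected only on its own query images.
                pairs.add((img, cat))
            elif option == 2:
                # Option 2: every class is detected on every query image.
                pairs.update((img, c) for c in classes)
            else:
                raise ValueError("option must be 1 or 2")
    return pairs


# Toy episode with made-up file names.
episode = {"dog": ["q1.jpg", "q2.jpg"], "cat": ["q3.jpg"]}
option1_pairs = evaluation_pairs(episode, option=1)
option2_pairs = evaluation_pairs(episode, option=2)
```

Under option 2 each query image can also produce false positives for the other classes of the episode, which makes it the harder of the two protocols.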

Could you please clarify which protocol was used to obtain the results of the paper?

Best, Anton

jshtok commented 4 years ago

Hi Anton,

Thank you for your interest in our work.

The code searches for objects of all classes in all query images. The query images are separated by category for convenience.

I agree that the loop over the classes is a bit misleading. However, note that the variable 'cat', which denotes the category the query image belongs to, is not used inside the loop. The predictions produced by the model cover all classes.

For a simple check, replace the loop with your own, in which the query images are loaded in random order.
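A minimal sketch of such a check, under stated assumptions: `toy_detect` is a hypothetical stand-in for the model's per-image inference, and the helper names and the toy episode are made up for illustration. If the detector never looks at the grouping key, iterating per category and iterating over a shuffled flat list must give identical per-image predictions.

```python
import random
from typing import Callable, Dict, List

Scores = Dict[str, int]


def toy_detect(img: str) -> Scores:
    # Hypothetical stand-in for the model: returns a score for every
    # class of the episode, regardless of which class the image belongs to.
    return {"dog": len(img) % 3, "cat": len(img) % 2}


def detect_grouped(
    query_images: Dict[str, List[str]], detect: Callable[[str], Scores]
) -> Dict[str, Scores]:
    results: Dict[str, Scores] = {}
    for cat, images in query_images.items():  # 'cat' is only a grouping key
        for img in images:
            results[img] = detect(img)
    return results


def detect_shuffled(
    query_images: Dict[str, List[str]],
    detect: Callable[[str], Scores],
    seed: int = 0,
) -> Dict[str, Scores]:
    # Flatten the per-category lists and visit the images in random order.
    flat = [img for images in query_images.values() for img in images]
    random.Random(seed).shuffle(flat)
    return {img: detect(img) for img in flat}


episode = {"dog": ["q1.jpg", "query2.jpg"], "cat": ["q3.jpeg"]}
grouped = detect_grouped(episode, toy_detect)
shuffled = detect_shuffled(episode, toy_detect)
```

Comparing `grouped` and `shuffled` image by image shows whether the category grouping influences the predictions.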

Regards, Joseph

On Tue, May 26, 2020 at 1:33 PM, Anton Osokin wrote the message quoted above (https://github.com/jshtok/RepMet/issues/30).

aosokin commented 4 years ago

Joseph, thanks a lot for the quick reply! I guess I just got lost in the code. Best, Anton
