I tried training the current model in its current setting, i.e. without batch norm, and the jigsaw task did not reach the accuracy of 71 reported in the original paper. After adding batch norm I was able to get an accuracy of 71 on the jigsaw task, but the VOC classification result was 4 mAP points below what the paper reported. Any ideas?