irfanICMLL / structure_knowledge_distillation

The official code for the paper 'Structured Knowledge Distillation for Semantic Segmentation' (CVPR 2019 oral), with extensions to other tasks.
BSD 2-Clause "Simplified" License

Training from scratch reaches performance similar to initializing from ImageNet #68

Open Shawn207 opened 2 years ago

Shawn207 commented 2 years ago

Hi there, I was trying to reproduce the results reported here for PSPNet-ResNet18. I actually trained it from scratch (`--is_student_load_imgnet=False`, with `./ckpt/save_path/Student` left empty), so the student was not initialized with any pretrained weights, yet I reached 70% accuracy with batch_size = 7, which is even slightly smaller than the original setting. When I tried 6,000 steps, the accuracy was 72.7%. This behavior is similar to what is reported when initializing from a pretrained ImageNet model. Do you have any idea why? Thanks!
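
For context, here is a minimal sketch (not the repository's actual code) of how a flag like `--is_student_load_imgnet` typically gates student initialization; the function name `build_student_backbone` and the use of torchvision weights are assumptions for illustration:

```python
# Hypothetical sketch of flag-gated student initialization;
# the real repo may wire this differently.
import torch
import torchvision.models as models

def build_student_backbone(is_student_load_imgnet: bool) -> torch.nn.Module:
    """Return a ResNet-18 backbone, optionally with ImageNet weights."""
    if is_student_load_imgnet:
        # Initialize from ImageNet-pretrained weights.
        return models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
    # Train from scratch: random initialization only.
    return models.resnet18(weights=None)

# With the flag off (as in this issue), the student starts from random
# weights, so matching the pretrained-init accuracy is unexpected.
student = build_student_backbone(is_student_load_imgnet=False)
```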

AiladReb commented 1 year ago

@Shawn207 Could you please tell me how you trained the teacher?