CSAILVision / sceneparsing

Development kit for MIT Scene Parsing Benchmark
http://sceneparsing.csail.mit.edu
BSD 3-Clause "New" or "Revised" License

Failed to reproduce DilatedNet performance #12

Open DonghyunK opened 7 years ago

DonghyunK commented 7 years ago

Hi,

I am trying to reproduce DilatedNet.

However, my training results are: pixel acc: 72.4%, mean acc: 38.6%, mean IoU: 28.7%.
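For reference, the three metrics quoted above are commonly derived from a per-class confusion matrix. The following is a minimal sketch under that assumption, not the benchmark's own evaluation code; the function name `segmentation_metrics` is hypothetical, and the official toolkit may differ in details such as how unlabeled pixels are handled or whether scores are averaged per image.

```python
def segmentation_metrics(conf):
    """Compute (pixel_acc, mean_acc, mean_iou) from a confusion matrix.

    conf[i][j] = number of pixels whose ground-truth class is i and
    whose predicted class is j. Classes that never occur are skipped
    when averaging (an assumption of this sketch).
    """
    num_classes = len(conf)
    total = sum(sum(row) for row in conf)
    correct = sum(conf[i][i] for i in range(num_classes))
    pixel_acc = correct / total

    class_accs, ious = [], []
    for i in range(num_classes):
        gt = sum(conf[i])                                   # pixels truly of class i
        pred = sum(conf[r][i] for r in range(num_classes))  # pixels predicted as class i
        tp = conf[i][i]                                     # correctly labeled pixels
        if gt > 0:
            class_accs.append(tp / gt)
        union = gt + pred - tp
        if union > 0:
            ious.append(tp / union)

    mean_acc = sum(class_accs) / len(class_accs)
    mean_iou = sum(ious) / len(ious)
    return pixel_acc, mean_acc, mean_iou


# Toy two-class example (made-up numbers, not benchmark data).
toy_conf = [[3, 1],
            [1, 5]]
pixel_acc, mean_acc, mean_iou = segmentation_metrics(toy_conf)
```

Because mean acc and mean IoU weight every class equally, rare classes can pull them far below pixel accuracy, which is consistent with the gap between the numbers reported here.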

Further training does not show improvement.

I am using a pre-trained net and multiple GPUs with a mini-batch size of 8. I did not use any augmentation because the paper does not explain which augmentations were used. I assume augmentation affects the results only slightly; otherwise you would probably have described it in the paper.

(1) Could you explain which augmentations were used and how much they improve the results?

(2) Could you provide training and validation log files?

Thank you so much.

hangzhaomit commented 7 years ago

Augmentation only helps a little (<2%); we only used flipping during training. Try initializing the model with a VGG network pretrained on ImageNet, and do not add layers such as batch normalization.
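A flipping augmentation for segmentation has to mirror the image and its label mask together, otherwise the pixels and their annotations no longer line up. Here is a minimal sketch, assuming a horizontal flip applied with probability 0.5; the comment above only says "flipping", so the direction, the probability, and the helper name `random_horizontal_flip` are assumptions of this example.

```python
import random


def random_horizontal_flip(image, label, p=0.5, rng=random):
    """Mirror image and label left-to-right together with probability p.

    image and label are lists of rows with the same height and width.
    Both are flipped, or neither is, so each pixel keeps its annotation.
    """
    if rng.random() < p:
        image = [row[::-1] for row in image]
        label = [row[::-1] for row in label]
    return image, label


# Toy 2x3 example (made-up values).
img = [[10, 20, 30],
       [40, 50, 60]]
lab = [[0, 1, 1],
       [2, 2, 0]]

flipped_img, flipped_lab = random_horizontal_flip(img, lab, p=1.0)  # always flips
same_img, same_lab = random_horizontal_flip(img, lab, p=0.0)        # never flips
```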

balloch commented 7 years ago

@DonghyunK, can you comment on whether the above worked? Also, @hangzhaomit, what do you mean by initializing the model with a VGG pretrained on ImageNet? Is DilatedNet just a standard VGG? Won't the difference in convolution type cause incompatibility?
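On the compatibility question in general: dilation changes where a kernel samples its input, not how many weights it has. A kernel of a given size therefore has the same parameter shape at any dilation rate, and weights trained with standard convolutions can be loaded into dilated layers of matching shape. The 1-D sketch below illustrates this with a hypothetical `conv1d` helper; it says nothing about how the authors configured DilatedNet.

```python
def conv1d(signal, kernel, dilation=1):
    """Valid 1-D cross-correlation with taps spaced `dilation` samples apart."""
    span = (len(kernel) - 1) * dilation  # distance between first and last tap
    out = []
    for start in range(len(signal) - span):
        out.append(sum(w * signal[start + k * dilation]
                       for k, w in enumerate(kernel)))
    return out


kernel = [1, 0, -1]                 # the same three weights in both calls
signal = [0, 1, 4, 9, 16, 25, 36]

standard = conv1d(signal, kernel, dilation=1)  # taps at offsets 0, 1, 2
dilated = conv1d(signal, kernel, dilation=2)   # taps at offsets 0, 2, 4
```

The dilated call reuses the identical `kernel` list and only widens the receptive field, which also shortens the valid output.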