zylo117 / Yet-Another-EfficientDet-Pytorch

A PyTorch re-implementation of the official EfficientDet, with SOTA real-time performance and pretrained weights.
GNU Lesser General Public License v3.0

Trained D0 model on COCO2017 with EfficientNet pre-trained weights, and the results are not good enough. #410

Open Wanjpeng opened 4 years ago

Wanjpeng commented 4 years ago

@zylo117 Thanks for your great work! I'm training EfficientDet-D0 on the COCO2017 dataset with EfficientNet pretrained weights (https://github.com/lukemelas/EfficientNet-PyTorch). After around 100 epochs I get a train loss of 0.221 (total loss) but a val loss of 0.305 (total loss), and the val total loss stops descending once it reaches 0.305. This seems to be overfitting. Here are my questions:

  1. Does the initialization of the BiFPN and output-layer weights greatly affect the results? (I use the EfficientNet pretrained weights for the backbone, but not for the BiFPN and output layers.)
  2. Should the preprocessing be changed when using pretrained weights (EfficientNet weights trained with AdvProp)?
zylo117 commented 4 years ago
  1. No.
  2. Yes. Without proper preprocessing, the pretrained backbone weights are trash.
Wanjpeng commented 4 years ago

@zylo117 So do you mean that if I use pretrained weights (with AdvProp), I should use `normalize = transforms.Lambda(lambda img: img * 2.0 - 1.0)` rather than `normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])`?

zylo117 commented 4 years ago

Yes, modify this line.

https://github.com/zylo117/Yet-Another-EfficientDet-Pytorch/blob/master/utils/utils.py#L82

from `normalized_imgs = [(img / 255 - mean) / std for img in ori_imgs]`

to `normalized_imgs = [img / 255 * 2 - 1 for img in ori_imgs]`
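The two conventions can be sketched side by side (a minimal NumPy sketch; the function names are mine, not the repo's):

```python
import numpy as np

def normalize_imagenet(img):
    """Standard ImageNet normalization, for vanilla EfficientNet weights.
    img: float array with pixel values in [0, 255], channels last."""
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    return (img / 255.0 - mean) / std

def normalize_advprop(img):
    """AdvProp normalization: simply rescale pixels to [-1, 1]."""
    return img / 255.0 * 2.0 - 1.0
```

The key point of the thread: AdvProp checkpoints were trained on inputs scaled to [-1, 1], so feeding them ImageNet-normalized inputs silently wastes the pretrained features.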

Wanjpeng commented 4 years ago

@zylo117 Thanks for your help! I will change it and give it a shot.

Wanjpeng commented 4 years ago

@zylo117 Still cannot get a good result; the best mAP on the COCO2017 val set is only 0.272, which is far lower than yours.

Wanjpeng commented 4 years ago

@zylo117 Since overfitting occurred, I think training-data augmentation should be applied. I was wondering what kind of data augmentation you used? In your code there is no augmentation at all, only image normalization. Thanks very much.

zylo117 commented 4 years ago

The official repo doesn't apply any augmentation other than h-flip except on D6 and D7, so neither does mine. Can you share your loss graph? Is it overfitting already?
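For detection, a horizontal flip has to mirror the boxes as well as the pixels. A minimal sketch (not the repo's implementation; boxes assumed to be `[x1, y1, x2, y2]` in absolute pixel coordinates):

```python
import numpy as np

def random_hflip(image, boxes, p=0.5, rng=np.random):
    """Horizontally flip an HxWxC image and its [x1, y1, x2, y2] boxes
    with probability p."""
    if rng.rand() < p:
        w = image.shape[1]
        image = image[:, ::-1].copy()
        boxes = boxes.copy()
        # mirror x-coordinates; x1/x2 swap keeps x1 <= x2
        boxes[:, [0, 2]] = w - boxes[:, [2, 0]]
    return image, boxes
```

For example, with `p=1.0` a box `[10, 5, 30, 20]` in a 100-pixel-wide image becomes `[70, 5, 90, 20]`.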

Wanjpeng commented 4 years ago

Here are my training logs (link: https://pan.baidu.com/s/1qRd0HiAULa6JfNggc41L5g, extraction code: qqc9), including training logs and TensorBoard logs. Thank you very much. @zylo117

Wanjpeng commented 4 years ago

I think it started overfitting after fine-tuning from the EfficientNet weights, around epoch 60 (train_woadv_7.10_09.56.log).
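Since the val loss plateaus around that point while the train loss keeps falling, one simple guard (illustrative only, not part of this repo) is an early-stopping check on the validation loss:

```python
class EarlyStopper:
    """Stop when val loss hasn't improved by min_delta for `patience` epochs."""

    def __init__(self, patience=5, min_delta=1e-3):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience  # True -> stop training
```

Calling `step()` once per epoch with the current val loss returns `True` once the plateau has lasted `patience` epochs.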

zylo117 commented 4 years ago

It has. Maybe try a larger batch size or SGD.
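A sketch of what switching to SGD could look like in PyTorch. The hyperparameters and the `Linear` stand-in model below are illustrative placeholders, not the repo's defaults:

```python
import torch

# Stand-in module; in practice this would be the EfficientDet model.
model = torch.nn.Linear(10, 4)

# SGD with momentum; SGD typically needs a much larger lr than Adam-family
# optimizers, so the lr here would need tuning for a real run.
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.08,
    momentum=0.9,
    weight_decay=4e-5,
    nesterov=True,
)
# Optionally decay the lr when the val loss plateaus.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience=3)
```

The intuition behind the suggestion is that SGD's noisier updates often generalize better than adaptive optimizers on detection tasks, at the cost of slower early convergence.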

Wanjpeng commented 4 years ago

So, did you also run into this overfitting problem?

Wanjpeng commented 4 years ago

Hello @zylo117, I read the official EfficientDet repo and found that the official code does use these augmentation methods. (Official repo: aug/autoaugment.py, distort_image_with_autoaugment())

Wanjpeng commented 4 years ago

(image attachment)

bakirsw commented 3 years ago

Hi mate @Wanjpeng, can you please share your changes to the code? I want to train the D0 model on COCO2017 with EfficientNet pretrained weights.