zylo117 / Yet-Another-EfficientDet-Pytorch

A PyTorch re-implementation of the official EfficientDet with SOTA real-time performance and pretrained weights.
GNU Lesser General Public License v3.0
5.2k stars 1.27k forks

[help wanted] Training with Street View Data #603

Open YvetteLi opened 3 years ago

YvetteLi commented 3 years ago

Hi,

I have been trying to train a detection model on street view data, but the results have not been ideal.

Here are the steps I have tried:

  1. Adjusted the anchor ratios using the provided k_means_anchor_size.ipynb tool, as recommended.
  2. Followed the same procedure as train_birdview_vehicles.ipynb: first trained with head_only set to True and a batch size of 32, then fine-tuned with head_only set to False and a batch size of 8.
! python train.py -c 0 -p ext_img_all --head_only True --lr 5e-3 --batch_size 32 --load_weights weights/efficientdet-d0.pth --saved_path model_ckpt/ --num_epochs 10 --save_interval 218
! python train.py -c 0 -p ext_img_all --head_only False --lr 1e-3 --batch_size 8 --load_weights=model_ckpt/ext_img_all/efficientdet-d0_9_2180.pth --num_epochs 100 --save_interval 1744

The classification loss seems to reach its minimum at around epoch 50, but the mAP is quite low.

logs/ext_img_all/efficientdet-d0_43_38368.pth
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.124
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.196
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.134
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.007
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.124
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.243
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.139
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.175
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.175
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.007
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.195
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.353
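To compare the per-size numbers in this summary more easily, a small parser along these lines might help. It is only a sketch: the regular expression is written against the COCO-style lines quoted above, and the names `parse_coco_summary`, `LINE_RE`, and `SUMMARY` are hypothetical helpers, not part of this repository or of pycocotools.

```python
import re

# Four of the AP lines quoted above; the remaining lines follow the same pattern.
SUMMARY = """\
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.124
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.007
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.124
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.243
"""

# Hypothetical pattern for one summary line: metric, IoU range, area bucket,
# maxDets, and the reported value.
LINE_RE = re.compile(
    r"Average (?:Precision|Recall)\s+\((AP|AR)\)\s+@\[\s*IoU=([\d.:]+)\s*\|"
    r"\s*area=\s*(\w+)\s*\|\s*maxDets=\s*(\d+)\s*\]\s*=\s*(-?[\d.]+)"
)


def parse_coco_summary(text):
    """Turn COCO-style summary text into a list of dicts, one per metric line."""
    rows = []
    for match in LINE_RE.finditer(text):
        metric, iou, area, max_dets, value = match.groups()
        rows.append(
            {
                "metric": metric,
                "iou": iou,
                "area": area,
                "max_dets": int(max_dets),
                "value": float(value),
            }
        )
    return rows


rows = parse_coco_summary(SUMMARY)

# AP at IoU=0.50:0.95, keyed by object-size bucket.
ap_by_area = {
    r["area"]: r["value"]
    for r in rows
    if r["metric"] == "AP" and r["iou"] == "0.50:0.95"
}

for area, value in ap_by_area.items():
    print(f"AP[{area}] = {value:.3f}")
```

Read this way, the small-object AP (0.007) is far below the large-object AP (0.243), which suggests that small objects account for much of the low overall score.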

I have uploaded the images and the configuration file to Baidu Wangpan (Baidu Netdisk). Could you please take a look at your convenience?

Link: https://pan.baidu.com/s/11PvUiPd5t3rrkiOV_WiFrw Password: glnf

zylo117 commented 3 years ago

Can you share your TensorBoard loss graph? Zoom in to see whether there is overfitting. If not, keep training. mAP is the AP averaged across all classes. But all of your anchors are vertical rectangles, so when it comes to horizontal objects such as some lights or signs, the model may fail to detect them, hence the low mAP.
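The point about vertical anchors can be illustrated with a shape-only IoU check, where boxes are compared as if they shared the same center. This is a rough sketch under stated assumptions: the anchor ratios, the 32-pixel base size, and the 64x16 "wide sign" box are all made-up illustrative values, not the ones from the uploaded configuration, and 0.5 is assumed here as a typical positive-match IoU threshold.

```python
def shape_iou(w1, h1, w2, h2):
    """IoU of two boxes that share the same center, so only their shapes matter."""
    inter = min(w1, w2) * min(h1, h2)
    union = w1 * h1 + w2 * h2 - inter
    return inter / union


def best_anchor_iou(box_wh, ratios, base_size=32.0):
    """Best shape-only IoU between one box and anchors built from (w, h) ratio pairs."""
    box_w, box_h = box_wh
    return max(
        shape_iou(box_w, box_h, base_size * rw, base_size * rh)
        for rw, rh in ratios
    )


# Hypothetical anchor ratios as (width_scale, height_scale) pairs.
VERTICAL_ONLY = [(0.7, 1.4), (0.5, 2.0), (0.4, 2.5)]
WITH_HORIZONTAL = VERTICAL_ONLY + [(2.0, 0.5)]

# Hypothetical wide object, e.g. a horizontal sign, as (width, height) in pixels.
WIDE_SIGN = (64.0, 16.0)

vertical_best = best_anchor_iou(WIDE_SIGN, VERTICAL_ONLY)
mixed_best = best_anchor_iou(WIDE_SIGN, WITH_HORIZONTAL)

print(f"best IoU, vertical anchors only: {vertical_best:.3f}")
print(f"best IoU, with a horizontal anchor: {mixed_best:.3f}")
```

With these illustrative values, none of the vertical anchors reaches the assumed 0.5 threshold for the wide box, so such an object would rarely be assigned a positive anchor during training. Adding one horizontal ratio fixes the match. Running the same check on the real ground-truth boxes and the configured ratios would show whether this applies to the street view dataset.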