fizyr / keras-retinanet

Keras implementation of RetinaNet object detection.
Apache License 2.0

Pretrained models for other backbone models #300

Closed ChengshuLi closed 4 years ago

ChengshuLi commented 6 years ago

Hi,

Thank you for the great work! Is there any chance you may release the pretrained models for other backbone models, e.g. resnet101, resnet152 or mobilenet128_1.0, mobilenet128_0.75, mobilenet160_1.0? Currently we only have pretrained models for resnet50.

That would be super helpful for transfer learning. Otherwise, I might need to train on COCO from scratch.

Thanks a lot!

hgaiser commented 6 years ago

Considering our resources for this project are limited, we don't provide pretrained models for the other architectures. If in the future we happen to have trained these architectures on COCO, we will probably make them publicly available. For now, your best bet is to start with ImageNet-trained weights and then fine-tune on COCO or your own dataset.

ChengshuLi commented 6 years ago

Cool. Thanks for letting me know!

hgaiser commented 6 years ago

I assigned the "help wanted" label. We'd be happy to add pretrained (COCO/Pascal) networks to this repository if they are provided to us, but there is a risk that the architecture changes, which would make those models obsolete. If that is the case, we likely won't update the pretrained models (except for ResNet50 on COCO).

lvaleriu commented 6 years ago

For training on coco what are the parameters? batch_size=1, flip_x augmentation? (if that matters)

Yes, models might change a bit. It might be a good idea to use the official Keras models (from keras.applications) or the ones from https://github.com/keras-team/keras-contrib, or to copy them directly into this repository (but we would still need to link to the ImageNet weights).

hgaiser commented 6 years ago

> For training on coco what are the parameters? batch_size=1, flip_x augmentation? (if that matters)

Yeah, only those.

lvaleriu commented 6 years ago

Started training on COCO (train2017 + val2017) using mobilenet224_1.0 + batch_size=1 + flip_x + image_min_side=800, image_max_side=1333 + per-class NMS + FPN correction.
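A rough sketch of how some of these settings might map onto the training script's command line. The flag names (`--backbone`, `--batch-size`, `--image-min-side`, `--image-max-side`) and the `coco` subcommand are taken from later keras-retinanet releases and may not match the version used in this thread; the COCO path is a placeholder, `build_train_command` is a hypothetical helper, and the flip_x, per-class NMS and FPN correction settings are not covered here.

```python
import shlex


def build_train_command(backbone, batch_size, min_side, max_side, coco_path):
    """Assemble an argv list for the training script.

    Flag names are assumptions based on later keras-retinanet releases.
    """
    return [
        "keras_retinanet/bin/train.py",
        "--backbone", backbone,
        "--batch-size", str(batch_size),
        "--image-min-side", str(min_side),
        "--image-max-side", str(max_side),
        "coco", coco_path,  # dataset subcommand and its path go last
    ]


cmd = build_train_command("mobilenet224_1.0", 1, 800, 1333, "/path/to/MS/COCO")
# Print a shell-safe version of the command for copy and paste.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the command as a list keeps each value separate, so it can be passed to `subprocess.run` without shell quoting issues.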

I'll keep updating this post with training results.

Epoch 2: 10000/10000 [==============================] - 2939s 294ms/step - loss: 3.7631 - regression_loss: 2.8676 - classification_loss: 0.8955

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.004
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.009
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.002
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.002
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.005
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.005
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.049
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.097
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.102
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.018
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.076
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.169

Epoch 4: 10000/10000 [==============================] - 2810s 281ms/step - loss: 3.3364 - regression_loss: 2.5441 - classification_loss: 0.7923

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.009
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.020
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.007
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.006
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.014
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.011
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.088
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.197
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.231
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.096
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.239
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.348

Epoch 6: 10000/10000 [==============================] - 2757s 276ms/step - loss: 3.1225 - regression_loss: 2.3957 - classification_loss: 0.7268

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.019
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.040
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.016
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.013
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.029
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.023
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.111
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.227
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.259
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.103
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.271
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.384

Epoch 8: 10000/10000 [==============================] - 4962s 496ms/step - loss: 2.9642 - regression_loss: 2.2937 - classification_loss: 0.6706

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.030
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.060
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.025
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.017
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.040
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.037
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.126
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.255
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.301
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.137
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.329
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.424

Epoch 11: 10000/10000 [==============================] - 3330s 333ms/step - loss: 2.7947 - regression_loss: 2.1787 - classification_loss: 0.6160

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.046
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.090
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.041
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.024
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.062
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.057
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.140
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.278
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.332
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.169
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.369
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.448

Epoch 13: 10000/10000 [==============================] - 28566s 3s/step - loss: 2.7168 - regression_loss: 2.1301 - classification_loss: 0.5867

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.057
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.108
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.053
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.032
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.076
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.070
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.152
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.296
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.351
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.181
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.390
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.468

Epoch 16: 10000/10000 [==============================] - 8350s 835ms/step - loss: 2.6045 - regression_loss: 2.0449 - classification_loss: 0.5596

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.067
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.128
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.063
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.034
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.084
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.084
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.160
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.300
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.347
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.186
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.381
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.455

Epoch 18: 10000/10000 [==============================] - 21270s 2s/step - loss: 2.5862 - regression_loss: 2.0303 - classification_loss: 0.5559

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.070
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.133
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.066
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.036
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.089
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.086
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.157
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.295
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.334
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.177
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.365
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.443

Epoch 20: 10000/10000 [==============================] - 2790s 279ms/step - loss: 2.4398 - regression_loss: 1.9232 - classification_loss: 0.5166

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.085
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.158
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.083
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.043
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.103
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.104
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.170
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.320
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.369
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.211
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.401
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.477
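Each progress line above reports a total loss next to its regression and classification parts, and in these logs the total equals the sum of the two (for example, 2.8676 + 0.8955 = 3.7631 at epoch 2). Below is a small sketch for pulling those numbers out of pasted log lines and checking the sum. The regular expression and the helper name `parse_losses` are illustrative and not part of keras-retinanet.

```python
import re

# Matches the three loss fields of a Keras progress line as pasted in this thread.
LOSS_RE = re.compile(
    r"loss: (?P<total>\d+\.\d+) - "
    r"regression_loss: (?P<reg>\d+\.\d+) - "
    r"classification_loss: (?P<cls>\d+\.\d+)"
)


def parse_losses(line):
    """Return (total, regression, classification) as floats."""
    match = LOSS_RE.search(line)
    if match is None:
        raise ValueError("no loss fields found in line")
    return (
        float(match.group("total")),
        float(match.group("reg")),
        float(match.group("cls")),
    )


# Two lines copied from the log above (epochs 2 and 20).
lines = [
    "10000/10000 [====] - 2939s 294ms/step - loss: 3.7631 - regression_loss: 2.8676 - classification_loss: 0.8955",
    "10000/10000 [====] - 2790s 279ms/step - loss: 2.4398 - regression_loss: 1.9232 - classification_loss: 0.5166",
]

for line in lines:
    total, reg, cls = parse_losses(line)
    # Allow for rounding in the printed values.
    consistent = abs(total - (reg + cls)) < 1e-3
    print(total, reg, cls, "sum matches:", consistent)
```

Most of the total comes from the regression term in every epoch shown, which is worth keeping in mind when comparing against another backbone's curve.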

ghost commented 6 years ago

@lvaleriu How is your mobilenet training going? My training on COCO using densenet169 as the backbone gives an mAP of only 0.028 at epoch 24.

lvaleriu commented 6 years ago

@panda9095 Very bad. So I'll start again using the FPN correction.

hgaiser commented 6 years ago

Actually it got merged into master.

smehdia commented 6 years ago

Could I use the mobilenet initial weights from here? https://github.com/experiencor/basic-yolo-keras Would that work?

lvaleriu commented 6 years ago

@panda9095 Started training mobilenet on coco again. I'll update the previous comment with the results after each epoch.

lvaleriu commented 6 years ago

@panda9095 It seems better now. @hgaiser Can you take a look at the learning progression? I've never trained resnet50 from scratch on COCO, so I don't have a reference for the learning curve.

jjiunlin commented 6 years ago

Here are the results after training mobilenet224_1.0 for 140+ epochs (keras-retinanet 0.2, batch_size=1). Every epoch takes 60 min on my single 1080 Ti, and the GPU utilization is 90%+. The red line is mobilenet224_1.0 and the orange line is res50_retinanet. It seems that the loss decreases very slowly. The learning rate changes because I resumed training from epoch 100 using the --weights option.

[Three screenshots of the training curves, dated 2018-03-10]

sujeet-gandhi commented 6 years ago

@lvaleriu, can you please explain why we get 6 values each of precision and recall?

lvaleriu commented 6 years ago

As defined in http://cocodataset.org/#detection-eval, here are the 12 metrics:

[image: table of the 12 COCO detection metrics]
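In short, the summary prints 6 AP values (averaged over IoU 0.50:0.95, at IoU 0.50, at IoU 0.75, and for small, medium and large objects) followed by 6 AR values (for at most 1, 10 and 100 detections per image, and for small, medium and large objects). In the COCO definition, small means an area below 32² pixels, medium between 32² and 96², and large above 96². The sketch below attaches short names to the 12 numbers of one summary block. The short names and the helper `parse_coco_summary` are made up for illustration; only the order of the values comes from the printed summary.

```python
import re

# Short labels in the order the evaluation summary prints its 12 values.
COCO_METRIC_NAMES = [
    "AP", "AP50", "AP75", "AP_small", "AP_medium", "AP_large",
    "AR_max1", "AR_max10", "AR_max100", "AR_small", "AR_medium", "AR_large",
]

# Each summary entry ends with "] = <value>".
VALUE_RE = re.compile(r"\]\s*=\s*(-?\d+\.\d+)")


def parse_coco_summary(text):
    """Map the 12 summary values to short metric names."""
    values = [float(v) for v in VALUE_RE.findall(text)]
    if len(values) != len(COCO_METRIC_NAMES):
        raise ValueError("expected 12 values, found %d" % len(values))
    return dict(zip(COCO_METRIC_NAMES, values))


# Epoch 20 summary copied from earlier in this thread.
SAMPLE = """
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.085
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.158
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.083
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.043
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.103
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.104
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.170
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.320
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.369
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.211
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.401
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.477
"""

metrics = parse_coco_summary(SAMPLE)
for name in COCO_METRIC_NAMES:
    print(name, metrics[name])
```

The first value (AP averaged over IoU 0.50:0.95) is the one usually quoted as "mAP" when comparing backbones on COCO.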

sujeet-gandhi commented 6 years ago

@lvaleriu Thanks.

ray-lee-94 commented 5 years ago

Actually, I need support for more powerful backbones, such as ResNeXt or SE-ResNeXt. Of course I tried it myself, but the performance dropped a little while I expected it to be higher. Maybe that's because I used a custom dataset which contains about 10K images. I will train on COCO. If there are any ideas for higher performance, I would be grateful.

hgaiser commented 5 years ago

PRs for those backbones would be very welcome.

Pretraining on COCO sounds like the right thing to do; it also gives you a better measure of how well the backbone works.

TimoK93 commented 5 years ago

Hey all! I just tried to train a net with mobilenet160_0.75 as the backbone. I just added "--backbone mobilenet160_0.75" to the command provided in the README.md for training on csv datasets. It throws an error while creating mobilenet in the keras-applications site-package. Did I forget an argument?
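The backbone names used in this thread appear to follow the pattern `mobilenet<input size>_<alpha>`, where alpha is the MobileNet width multiplier. A quick sanity check of a name before training might look like the sketch below. It assumes that naming pattern, and `parse_mobilenet_backbone` is a hypothetical helper, not a keras-retinanet function; the allowed values are the input sizes and alphas for which Keras ships ImageNet weights for MobileNet (v1). It does not diagnose the error mentioned above.

```python
import re

# Assumed naming pattern: mobilenet<input size>_<alpha>, e.g. mobilenet160_0.75.
BACKBONE_RE = re.compile(r"^mobilenet(?P<size>\d+)_(?P<alpha>\d+(?:\.\d+)?)$")

# Input sizes and width multipliers with ImageNet weights in Keras MobileNet (v1).
KERAS_SIZES = {128, 160, 192, 224}
KERAS_ALPHAS = {0.25, 0.5, 0.75, 1.0}


def parse_mobilenet_backbone(name):
    """Split a backbone name into (input_size, alpha) or raise ValueError."""
    match = BACKBONE_RE.match(name)
    if match is None:
        raise ValueError("unexpected backbone name: %r" % name)
    size = int(match.group("size"))
    alpha = float(match.group("alpha"))
    if size not in KERAS_SIZES or alpha not in KERAS_ALPHAS:
        raise ValueError("no ImageNet weights for size=%d, alpha=%s" % (size, alpha))
    return size, alpha


print(parse_mobilenet_backbone("mobilenet160_0.75"))
```

If the name passes a check like this, the problem is more likely in the installed Keras or keras-applications version than in the argument itself, though that is only a guess without the actual error message.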

hgaiser commented 5 years ago

That's better suited for a separate issue (also, mention the error, it helps to find the cause).

scstu commented 5 years ago

@lvaleriu Could you please share the pretrained model for the mobilenet128_1.0 backbone with me?

liminghuiv commented 5 years ago

For mobilenet, I saw keras-retinanet is used in vehicle detection: https://github.com/yangliupku/retinanet_detection Can someone merge it?

hgaiser commented 4 years ago

I'm closing this in favor of https://github.com/fizyr/keras-retinanet/issues/1161

hasan-mh-aziz commented 4 years ago

> Actually, I need support for more powerful backbones, such as ResNeXt or SE-ResNeXt. Of course I tried it myself, but the performance dropped a little while I expected it to be higher. Maybe that's because I used a custom dataset which contains about 10K images. I will train on COCO. If there are any ideas for higher performance, I would be grateful.

Where did you get the pretrained weights for ResNeXt? Which implementation of ResNeXt did you follow?

Mansi2487 commented 3 years ago

> Actually, I need support for more powerful backbones, such as ResNeXt or SE-ResNeXt. Of course I tried it myself, but the performance dropped a little while I expected it to be higher. Maybe that's because I used a custom dataset which contains about 10K images. I will train on COCO. If there are any ideas for higher performance, I would be grateful.

Did you train ResNeXt on COCO? If yes, can you please provide me with the pretrained model?