smallcorgi / Faster-RCNN_TF

Faster-RCNN in Tensorflow
MIT License
2.34k stars 1.12k forks

Re-train model on new dataset: bbox necessary ? #118

Open MatthiasRMS opened 7 years ago

MatthiasRMS commented 7 years ago

Hi!

I'm working on retraining this model on a new dataset to extract attributes from clothes (e.g. sleeve type, neck type, etc.). I'm only interested in attribute prediction, not in predicting where the attributes are in the image.

I have a set of images with the attributes for each image, but I don't have the bounding boxes of each attribute in the image, as the PASCAL VOC dataset does.

Do I need the bounding boxes to retrain it?

LHagendoorn commented 7 years ago

By definition, object detection needs bounding boxes (or segmentation masks or something like that). If you don't need to know where in the image the object is, it is an image classification task, and you can use just VGG16 or ResNet. I think object detection should still perform better for you, as you want to detect multiple objects in one image, so it might be worth the effort to create some annotations.

MatthiasRMS commented 7 years ago

Thanks for your answer. I do have annotations (the classes present in each image), but not the boxes. You're right, it's more of a multi-label classification task, but I assumed it would perform better with localisation.

Can I do multi-label classification with VGG16?

Thanks a lot for your help

LHagendoorn commented 7 years ago

Well, I do think it will do better with localisation, and possibly you could adapt this to work without the bounding boxes. But I think the RPN will prune any boxes that have too little overlap with the ground-truth box, so you would need to bypass that somehow.
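To make "overlap with the ground-truth box" concrete: overlap between two boxes is usually measured as intersection over union (IoU). Here is a minimal sketch of that idea, not the code from this repo. The function names, the `(x1, y1, x2, y2)` box format with continuous coordinates, and the 0.5 threshold are all assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def keep_overlapping(proposals, gt_box, threshold=0.5):
    """Keep only proposals whose IoU with gt_box reaches the threshold.

    The threshold value is an illustrative assumption.
    """
    return [p for p in proposals if iou(p, gt_box) >= threshold]
```

Without any ground-truth box there is nothing to compute the overlap against, which is why training without box annotations needs some kind of workaround.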

You should be able to do multi-label classification with any network by replacing the final softmax with sigmoid outputs, and using binary_crossentropy as the loss.
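A rough sketch of what that change means, written in plain Python so it does not depend on any particular framework. It only shows the output activation and the loss, not a full network. The helper names, the attribute labels, and the 0.5 decision threshold are hypothetical. In a real model you would use your framework's built-in sigmoid and binary cross-entropy instead.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def binary_crossentropy(targets, logits, eps=1e-7):
    """Mean binary cross-entropy over independent labels.

    targets: list of 0/1 values, one per attribute
    logits:  raw network outputs, one per attribute
    """
    total = 0.0
    for t, z in zip(targets, logits):
        # Clip the probability to avoid log(0)
        p = min(max(sigmoid(z), eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(targets)


def predict_labels(logits, names, threshold=0.5):
    """Return every attribute whose sigmoid probability reaches the threshold."""
    return [n for n, z in zip(names, logits) if sigmoid(z) >= threshold]


# Hypothetical attribute names and logits for one image
names = ["long_sleeve", "v_neck", "collar"]
logits = [2.0, -1.0, 0.5]
print(predict_labels(logits, names))
print(binary_crossentropy([1, 0, 1], logits))
```

Unlike softmax, each sigmoid output is scored on its own, so several attributes can be predicted for the same image at once.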