paulcx opened this issue 7 years ago
I'm pretty sure the features you are looking for can be extracted from the ROI-pooling layer.
@paulcx,
(Assuming you're using VGG16). You can get the features in the following way:
Modify line 176 in test.py to grab the output from the 'conv5_3' layer:
```python
input_list = [net.get_output('cls_score'), net.get_output('cls_prob'),
              net.get_output('bbox_pred'), net.get_output('rois'),
              net.get_output('conv5_3')]
cls_score, cls_prob, bbox_pred, rois, features = sess.run(
    input_list,
    feed_dict=feed_dict,
    options=run_options,
    run_metadata=run_metadata)
```
1. Return the "features" variable to your main application.
2. Get the ROIs that are above your CONF_THRESH (similar to the demo).
3. Scale those ROIs back down to the feature-map size. For VGG16, the feature map is downsampled by a factor of 16 in height and width.
4. Extract the scaled ROIs across all 512 channels:
```python
for roi in rois:
    # roi holds [xmin, ymin, xmax, ymax]; the coordinates must already be
    # scaled down to feature-map resolution and cast to int before slicing
    xmin, ymin, xmax, ymax = roi[0], roi[1], roi[2], roi[3]
    extracted_feature = features[0, ymin:ymax, xmin:xmax, :]
    # Do some stuff here
```
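The filtering, scaling, and cropping steps above can be sketched end-to-end in NumPy. This is a sketch under assumptions: the names `scores`, `boxes`, and the stride of 16 come from the VGG16 setup described, not from the repository's actual code.

```python
import numpy as np

def crop_roi_features(features, boxes, scores, conf_thresh=0.8, stride=16):
    """Crop conv feature patches for detections above a confidence threshold.

    features: (1, H, W, C) conv5_3 output from sess.run
    boxes:    (N, 4) ROIs in image coordinates [xmin, ymin, xmax, ymax]
    scores:   (N,) detection confidences
    """
    keep = np.where(scores >= conf_thresh)[0]
    patches = []
    for xmin, ymin, xmax, ymax in boxes[keep]:
        # Scale image coordinates down to the feature map (VGG16 stride = 16)
        x0, y0 = int(np.floor(xmin / stride)), int(np.floor(ymin / stride))
        x1, y1 = int(np.ceil(xmax / stride)), int(np.ceil(ymax / stride))
        # Guarantee at least a 1x1 patch after rounding
        x1, y1 = max(x1, x0 + 1), max(y1, y0 + 1)
        patches.append(features[0, y0:y1, x0:x1, :])
    return keep, patches
```

Using floor for the top-left corner and ceil for the bottom-right keeps the crop from collapsing to an empty slice for small boxes.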
Check line 160 in test.py. That code path is used when RPN is disabled and you specify the ROIs to detect yourself. It mentions avoiding anti-aliasing when downsampling to the feature space, so it may be worth following that code when you downsample your ROIs.
Alternatively, you can just grab the vectors from the last fully connected layer (fc7) in the same way as the conv5_3 features. You will get a matrix of size num_proposals x 4096 (check this). Extract the rows you want using the inds created when filtering proposals by CONF_THRESH, then train a softmax layer on those vectors for your final classes. This is good because you avoid the extra bottleneck of stacking more fully connected layers for your age classifiers.
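Training a softmax layer on the extracted fc7 vectors can be sketched with plain NumPy (a minimal sketch; the shapes and names are assumptions, and in practice you would likely add the layer in TensorFlow on top of the graph instead):

```python
import numpy as np

def train_softmax(X, y, num_classes, lr=0.1, epochs=200):
    """Fit a single softmax layer on fixed feature vectors via gradient descent.

    X: (N, D) feature vectors, e.g. fc7 features for the kept proposals
    y: (N,) integer class labels, e.g. age buckets
    """
    n, d = X.shape
    W = np.zeros((d, num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[y]
    for _ in range(epochs):
        logits = X @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n                  # cross-entropy gradient
        W -= lr * X.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

def predict(X, W, b):
    return np.argmax(X @ W + b, axis=1)
```

Since the backbone features are frozen, this reduces to simple multinomial logistic regression on fixed vectors, which is cheap to train even on CPU.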
Thank you @louisquinn. I'm using resnet101 from here. I'm wondering if your second approach could suit my case. Generally speaking, the solution would be to extract the tensors from the last fc layer (or pooling layer) and train a softmax layer on top for the classifier.
By the way, the ResNet backbone on Faster R-CNN has very good detection performance.
I wonder if I can use Faster R-CNN to extract the features after bbox regression, then add one more layer such as xgboost to output probabilities. For example, I trained some classes like ears, noses and eyes, and I want to use the features from those selected bboxes to predict age. I'd be grateful if anyone can help.
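One practical wrinkle with that idea: the cropped patches like `extracted_feature` above have different spatial sizes per box, but a model such as xgboost needs fixed-length inputs. A common trick is to pool each crop over its spatial dimensions first. A minimal sketch (all names here are assumptions, not code from the repository):

```python
import numpy as np

def pool_patch(patch):
    """Global max-pool a (h, w, C) feature crop to a fixed (C,) vector."""
    return patch.max(axis=(0, 1))

def build_feature_matrix(patches):
    """Stack pooled crops into an (N, C) matrix, ready for a downstream
    classifier (softmax, xgboost, etc.) regardless of each box's size."""
    return np.stack([pool_patch(p) for p in patches])
```

Average pooling works the same way (`patch.mean(axis=(0, 1))`); either gives every detection a vector of the same length, so features from ear/nose/eye boxes can be concatenated into one input for the age predictor.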