ruotianluo / pytorch-faster-rcnn

pytorch1.0 updated. Support cpu test and demo. (Use detectron2, it's a masterpiece)
MIT License
1.82k stars, 475 forks

How can I extract the features associated with each bounding box? #103

Open pinkfloyd06 opened 6 years ago

pinkfloyd06 commented 6 years ago

Hello @ruotianluo,

Let me first thank you for your clean code.

I would like to extract the features associated with each bounding box using demo.py, as you suggested:

GPU_ID=0
CUDA_VISIBLE_DEVICES=${GPU_ID} ./tools/demo.py

1) Is there a way to extract the learned features?

2) In https://github.com/ruotianluo/pytorch-faster-rcnn/blob/master/tools/demo.py#L86, scores, boxes = im_detect(net, im) returns scores.shape=(300, 21) and boxes.shape=(300, 84).

I understand that there are 300 boxes, is that right? But I don't understand what 21 and 84 are.

Thank you for your answer.

ruotianluo commented 6 years ago

21 is the number of classes in PASCAL VOC (20 object classes plus background). 84 is 21 x 4: each class has its own bounding-box regressor, which outputs four coordinates.
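To make the two shapes concrete, here is a minimal sketch of how one class's boxes and scores can be sliced out of the arrays. It uses random dummy arrays in place of the real im_detect output, and the helper name class_detections is hypothetical, not part of the repository.

```python
import numpy as np

NUM_CLASSES = 21  # PASCAL VOC: 20 object classes + background

# Dummy stand-ins for the arrays returned by im_detect(net, im).
rng = np.random.default_rng(0)
scores = rng.random((300, NUM_CLASSES))      # one score per box per class
boxes = rng.random((300, 4 * NUM_CLASSES))   # one 4-vector per box per class


def class_detections(scores, boxes, cls_ind):
    """Return (boxes, scores) for one class: shapes (N, 4) and (N,)."""
    cls_boxes = boxes[:, 4 * cls_ind:4 * (cls_ind + 1)]
    cls_scores = scores[:, cls_ind]
    return cls_boxes, cls_scores


cls_boxes, cls_scores = class_detections(scores, boxes, cls_ind=15)
print(cls_boxes.shape, cls_scores.shape)  # (300, 4) (300,)
```

With a different number of classes K, the same slicing applies to arrays of shape (N, K) and (N, 4 * K).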

pinkfloyd06 commented 6 years ago

Thank you for your answer, @ruotianluo.

1) How can I extract the learned features associated with each bounding box? I suppose they are vectors of dimension 2048.

2) If I test this pretrained model on my custom dataset, should I also get scores and boxes of shape scores.shape=(300, 21) and boxes.shape=(300, 84)?

ruotianluo commented 6 years ago
  1. You need to customize network.py so that the features are exposed. The features you want are the fc7 here: https://github.com/ruotianluo/pytorch-faster-rcnn/blob/7fd5263a75af9f41deecf70006fa8c0bc6b8fb60/lib/nets/network.py#L268

  2. No. You need to change the NUM_CLASSES config.
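One way to "expose" an intermediate feature is to store it on the model during the forward pass and read it back afterwards. The sketch below shows only that pattern on a toy NumPy class. ToyDetector, its weights, the small feature dimension and the average-pooling head are all hypothetical stand-ins, not the repository's Network class.

```python
import numpy as np


class ToyDetector:
    """Hypothetical stand-in for a detection head (not the repo's code)."""

    def __init__(self, feat_dim=16, num_classes=21, seed=0):
        rng = np.random.default_rng(seed)
        self._w_cls = rng.standard_normal((feat_dim, num_classes)) * 0.01
        self._predictions = {}  # intermediate results kept for inspection

    def forward(self, pooled):
        # pooled: (num_rois, feat_dim, h, w), features pooled per RoI.
        # Average pooling is a placeholder for the real head.
        fc7 = pooled.mean(axis=(2, 3))           # (num_rois, feat_dim)
        cls_score = fc7 @ self._w_cls            # (num_rois, num_classes)
        self._predictions["fc7"] = fc7           # expose per-box features
        self._predictions["cls_score"] = cls_score
        return cls_score


net = ToyDetector()
pooled = np.ones((5, 16, 7, 7), dtype=np.float32)  # 5 dummy RoIs
cls_score = net.forward(pooled)
box_features = net._predictions["fc7"]
print(box_features.shape)  # (5, 16)
```

Row i of box_features then corresponds to RoI i, in the same order as the rows of the score and box arrays.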

pinkfloyd06 commented 6 years ago

Thank you for your answer.

2- There is no NUM_CLASSES in the config: https://github.com/ruotianluo/pytorch-faster-rcnn/blob/master/lib/model/config.py

3- boxes.shape=(300, 84), so boxes[0], of shape (84,), holds the boxes in the following coordinate format?
boxes[0]=[x{0,1},y{0,1},x{0,2},y{0,2},...,x{20,1},y{20,1},x{20,2},y{20,2}]

ruotianluo commented 6 years ago

Sorry, my mistake. The number of classes is an argument of the create_architecture function.

As for the box layout, I can't remember clearly. You can print it out and verify what it is, but I think you are right.

pinkfloyd06 commented 6 years ago

I confirm that. It is (x1,y1,x2,y2): https://github.com/ruotianluo/pytorch-faster-rcnn/blob/master/lib/nms/pth_nms.py#L10
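Given that layout, a common follow-up step is to keep, for each of the N proposals, only the box of its highest-scoring class. The sketch below is one way to do that, assuming class index 0 is the background class. The function name best_class_boxes is hypothetical.

```python
import numpy as np


def best_class_boxes(scores, boxes):
    """Pick, per proposal, the (x1, y1, x2, y2) box of its best non-background class.

    scores: (N, K) class scores, column 0 assumed to be background.
    boxes:  (N, 4 * K) per-class boxes laid out class by class.
    """
    n, num_classes = scores.shape
    per_class = boxes.reshape(n, num_classes, 4)   # (N, K, 4)
    best = scores[:, 1:].argmax(axis=1) + 1        # skip background column
    return per_class[np.arange(n), best], best


# Tiny example with N=2 proposals and K=3 classes.
scores = np.array([[0.90, 0.05, 0.05],
                   [0.10, 0.20, 0.70]])
boxes = np.arange(24).reshape(2, 12)
selected, best = best_class_boxes(scores, boxes)
print(best.tolist())      # [1, 2]
print(selected.tolist())  # [[4, 5, 6, 7], [20, 21, 22, 23]]
```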

king-zark commented 6 years ago

How can I get the feature vectors of all the RoIs we selected, as in the Bottom-Up paper? fc7 seems to be the final vector, so I cannot get the attention features for image captioning. Thank you, @ruotianluo.

king-zark commented 6 years ago

Have you figured out how to extract the features for each bounding box? @pinkfloyd06

king-zark commented 6 years ago

Sorry, I have understood the features now. However, I have run into another problem: when I try to use the model pretrained on the MS COCO dataset, _num_anchors does not match (18 vs. 24). How should I set the parameters _anchor_scales and _anchor_ratios? @ruotianluo
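One possible reading of the 18 vs. 24 mismatch, assuming the RPN classification layer has two channels (object / not object) per anchor and that the number of anchors is the number of scales times the number of ratios: 18 corresponds to 9 anchors and 24 to 12 anchors. The scale and ratio tuples below are assumed values chosen to reproduce those counts, not settings confirmed in this thread.

```python
def rpn_score_channels(anchor_scales, anchor_ratios):
    """Channels of an RPN score layer with 2 outputs per anchor."""
    num_anchors = len(anchor_scales) * len(anchor_ratios)
    return 2 * num_anchors


# Three scales x three ratios -> 9 anchors -> 18 channels.
print(rpn_score_channels((8, 16, 32), (0.5, 1, 2)))     # 18

# Four scales x three ratios -> 12 anchors -> 24 channels.
print(rpn_score_channels((4, 8, 16, 32), (0.5, 1, 2)))  # 24
```

If this reading is right, the network has to be built with the same number of scales and ratios that the checkpoint was trained with. Printing the shapes stored in the checkpoint is a quick way to check.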

Maddy12 commented 3 years ago

I would also like to understand how to extract the features for each bounding box. Can anyone share how they do this? @king-zark

JunweiLiang commented 2 years ago

You can use the output boxes to perform ROIAlign on the feature maps of the image. You can experiment with which layer to use.
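The suggestion above can be sketched as follows. This is a simplified NumPy version of ROIAlign that takes a single bilinear sample at the centre of each output bin. It is an illustration, not a library implementation (in practice a library op such as torchvision.ops.roi_align would be the usual choice). The feature-map stride of 16, the output size of 7 and the function names are assumptions.

```python
import numpy as np


def bilinear(feat, y, x):
    """Bilinearly sample feat (C, H, W) at a fractional location (y, x)."""
    _, h, w = feat.shape
    y = min(max(y, 0.0), h - 1.0)   # clamp to the feature map
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = (1 - dx) * feat[:, y0, x0] + dx * feat[:, y0, x1]
    bottom = (1 - dx) * feat[:, y1, x0] + dx * feat[:, y1, x1]
    return (1 - dy) * top + dy * bottom


def simple_roi_align(feat, boxes, out_size=7, spatial_scale=1.0 / 16):
    """Pool each (x1, y1, x2, y2) image-space box into (C, out_size, out_size)."""
    c = feat.shape[0]
    out = np.zeros((len(boxes), c, out_size, out_size), dtype=feat.dtype)
    for i, box in enumerate(boxes):
        # Map the box from image pixels to feature-map coordinates.
        x1, y1, x2, y2 = (float(v) * spatial_scale for v in box)
        bin_w = max(x2 - x1, 1e-6) / out_size
        bin_h = max(y2 - y1, 1e-6) / out_size
        for r in range(out_size):
            for s in range(out_size):
                # One sample at the centre of each bin.
                out[i, :, r, s] = bilinear(feat,
                                           y1 + (r + 0.5) * bin_h,
                                           x1 + (s + 0.5) * bin_w)
    return out


# Dummy feature map (C=8, H=38, W=50) and two boxes in image coordinates.
feat = np.random.default_rng(0).random((8, 38, 50)).astype(np.float32)
boxes = np.array([[0, 0, 160, 160], [100, 50, 400, 300]], dtype=np.float32)
pooled = simple_roi_align(feat, boxes)
features = pooled.mean(axis=(2, 3))   # one feature vector per box
print(pooled.shape, features.shape)   # (2, 8, 7, 7) (2, 8)
```

Averaging over the spatial dimensions gives one vector per box. Which layer's feature map to pool from, and therefore which spatial_scale applies, is left open, as the comment above notes.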