Closed dcahn12 closed 4 years ago
Hi, you can modify the related functions, e.g.:
def run_on_opencv_image(self, image, If_draw=True):
    """
    Arguments:
        image (np.ndarray): an image as returned by OpenCV
    Returns:
        result (np.ndarray): the image with the detections drawn on it
        prediction (BoxList): the detected objects. Additional information
            of the detection properties can be found in the fields of
            the BoxList via `prediction.fields()`
        roi_feats: the RoI features of the kept detections, in the same order
    """
    #predictions = self.compute_prediction(image)
    predictions, roi_feats = self.compute_prediction(image)
    top_predictions, top_roi_feats = self.select_top_predictions(predictions, roi_feats)
    result = image.copy()
    if self.show_mask_heatmaps:
        # return three values here too, so callers can always unpack the same way
        return self.create_mask_montage(result, top_predictions), top_predictions, top_roi_feats
    result = self.overlay_boxes(result, top_predictions)
    if self.cfg.MODEL.MASK_ON:
        # keep `result` an image; a trailing ", top_predictions" would turn it into a tuple
        result = self.overlay_mask(result, top_predictions)
    if self.cfg.MODEL.KEYPOINT_ON:
        result = self.overlay_keypoints(result, top_predictions)
    result = self.overlay_class_names(result, top_predictions)
    return result, top_predictions, top_roi_feats
and
def select_top_predictions(self, predictions, roi_feats):
#def select_top_predictions(self, predictions):
    """
    Select only predictions which have a `score` > self.confidence_threshold,
    and returns the predictions in descending order of score
    Arguments:
        predictions (BoxList): the result of the computation by the model.
            It should contain the field `scores`.
    Returns:
        prediction (BoxList): the detected objects. Additional information
            of the detection properties can be found in the fields of
            the BoxList via `prediction.fields()`
    """
    scores = predictions.get_field("scores")
    keep = torch.nonzero(scores > self.confidence_threshold).squeeze(1)
    predictions = predictions[keep]
    roi_feats = roi_feats[keep]
    scores = predictions.get_field("scores")
    _, idx = scores.sort(0, descending=True)
    return predictions[idx], roi_feats[idx]
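For reference, the keep-then-sort indexing above can be tried on plain tensors without the detector. This is only a minimal sketch: `select_top`, the sample scores and the 3-dimensional stand-in features are made up for illustration and are not part of maskrcnn_benchmark.

```python
import torch


def select_top(scores, roi_feats, threshold):
    """Hypothetical stand-alone version of the filtering logic above.

    scores:    [N] confidence scores
    roi_feats: [N, D] one feature row per detection
    """
    # Indices of detections whose score is above the threshold.
    keep = torch.nonzero(scores > threshold).squeeze(1)
    scores, roi_feats = scores[keep], roi_feats[keep]
    # Sort the survivors by descending score and apply the SAME
    # permutation to the features so rows stay aligned.
    _, idx = scores.sort(0, descending=True)
    return scores[idx], roi_feats[idx]


# Made-up data: row i of roi_feats is filled with the value i,
# which makes it easy to see which detection each row came from.
scores = torch.tensor([0.9, 0.3, 0.75, 0.95])
roi_feats = torch.arange(4, dtype=torch.float32).unsqueeze(1).repeat(1, 3)

top_scores, top_feats = select_top(scores, roi_feats, threshold=0.7)
print(top_scores)
print(top_feats[:, 0])  # original indices of the kept detections
```

The point is that the same `keep` and `idx` are applied to both the predictions and the features, so row i of the features always belongs to box i.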
@dcahn12
Thank you for your quick response!
There are still some points that I don't understand :(
First, as you can see in the picture below, I just added the x value returned by self.model() inside compute_prediction(). Is this code right for obtaining roi_feats?
The code for compute_prediction()
The code for model (GeneralizedRCNN)
Second, when I check the bbox values in the h5 file, they are not in the range 0 to 1, as shown in the picture below.
If you could give me more details on extracting the roi_feat and bbox information, I would appreciate it very much. :)
Sorry, I do not quite understand your problem. Do you mean that you could not get the right proposals for a given video? Just like the following?
The bbox values are not scaled by the size of the image frame, so they are not in [0, 1].
Ah, I found that the bbox values in the file you provided (msrvtt_foi_box.h5) were in [0, 1], so I thought they were normalized, as you can see in the picture below.
But, as you saw, when I extracted the bbox values myself, they were not in [0, 1] :( Could you check this once more?
And, as I asked above, is the method I used for extracting roi_feat right?
Hi, I think it's right. Btw, the boxes are in pixel coordinates; you can normalize them to [0, 1] by dividing by the width and height of the image (a single frame), e.g.:
(w, h) = top_preds.size
bbox = top_preds.bbox.clone()
w_id, h_id = [0, 2], [1, 3]
bbox[:, w_id] = bbox[:, w_id]/w
bbox[:, h_id] = bbox[:, h_id]/h
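The same normalization can be checked on a plain tensor. This is a minimal sketch that assumes the boxes are in (x1, y1, x2, y2) pixel format; `normalize_boxes` and the sample frame size are hypothetical, only the indexing mirrors the snippet above.

```python
import torch


def normalize_boxes(bbox, width, height):
    """Scale [N, 4] boxes in (x1, y1, x2, y2) pixel coordinates to [0, 1].

    Hypothetical helper for illustration; assumes xyxy box format.
    """
    out = bbox.clone().float()       # do not modify the caller's tensor
    w_id, h_id = [0, 2], [1, 3]      # x columns, y columns
    out[:, w_id] = out[:, w_id] / width
    out[:, h_id] = out[:, h_id] / height
    return out


# Made-up boxes for a 640x480 frame.
boxes = torch.tensor([[0.0, 0.0, 320.0, 240.0],
                      [160.0, 120.0, 640.0, 480.0]])
norm = normalize_boxes(boxes, width=640, height=480)
print(norm)
```

Dividing the x columns by the width and the y columns by the height keeps each coordinate relative to its own axis, which is why the two index lists are handled separately.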
@dcahn12
Hi, thank you! The bbox can be extracted with this method.
But I don't know how to extract roi_feat from the mask_rcnn code linked in this repo. Where in the mask_rcnn code do you extract the roi_feat value? Could you check it?
Hi, it's in maskrcnn_benchmark/modeling/detector/generalized_rcnn.py, which you posted. @dcahn12
Hi, but the dimension of the provided roi feature in msrvtt_roi_feat.h5 is [num_obj, 1000], while the feature dimension in maskrcnn_benchmark/modeling/detector/generalized_rcnn.py is [num_obj, 14, 14], so they do not match. :( I was confused about which feature I should use for roi_feat.
Please check the size once again; it should be [num_obj, 1024]. Refer to modeling/roi_heads/box_head/roi_box_feature_extractors.py for more details.
Thank you very much!!! 👍
In your misc/extract_feats_roi.py code, coco_demo.run_on_opencv_image() returns three values: result, top_preds and top_roi_feats.
But in the maskrcnn code you linked, only one value (result) is returned, as shown below.
How do I get the roi_feat for custom video data?