Closed vim5818 closed 4 years ago
@ppwwyyxx Thank you for the reply. I have checked the tutorial on Google Colab. In the segment "Run a pre-trained detectron2 model", I am able to visualise the bounding-box information. But I do not see such a variable or line of code in the cloned detectron2 repository. After a complete search across the executable files and folders, I cannot find the exact line of code shown in the colab tutorial.
Please support. Thank you.
The tutorial shows how to "print the list of objects detected along with the co-ordinates of Bounding Box." in https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5#scrollTo=7d3KxiHO_0gb as you asked.
Tutorials show how a user can use detectron2, so the content does not need to be part of the repository.
@ppwwyyxx Thank you once again. I understand my query should have been framed more clearly. I would like to reframe it.
I need support with point 3.
The code will run on a PC if you write it in a Python file on the PC and execute that file.
@ppwwyyxx By default, you have a lot of executables like visualizer.py and box_regression.py in the project, but it is unclear which one gives the final bounding-box output after detection. I would like to know if there is any file from which I can extract the same information as in colab; maybe I can work it out from there.
No file in the repository gives the coordinates of bounding boxes. The code in colab shows how to get them.
@deeplearner93 Hi. This is just an example. Detectron2 has the file /detectron2/demo/predictor.py, which is called by /detectron2/demo/demo.py. We will invoke /detectron2/demo/demo.py to do the test. https://github.com/facebookresearch/detectron2/tree/master/demo
PART1
STEP1. Open the file /detectron2/demo/predictor.py
STEP2. Edit the function run_on_image(self, image) in the following way. The last instruction in run_on_image is: return predictions, vis_output. Add the following print instructions before it:
print(instances)
print(instances.pred_boxes)
print(instances.pred_boxes[0])
OUTPUT AND EXPLANATION
I got these outputs.
A) OUTPUT OF print(instances)
Instances(num_instances=4, image_height=360, image_width=640, fields=[pred_boxes, scores, pred_classes, pred_masks])
Explanation: this output tells me there are 4 detected boxes.
B) OUTPUT OF print(instances.pred_boxes)
Boxes(tensor([[289.3555, 17.8171, 451.1482, 347.6050], [382.5501, 14.9712, 635.7133, 231.8446], [467.1654, 66.3414, 611.7201, 226.0997], [22.4782, 3.7928, 428.1484, 254.6716]]))
Explanation: this output gives me the coordinates of the detected boxes. In particular, the first box (instances.pred_boxes[0]) has its top-left point at (x, y) = (289.3555, 17.8171) and its bottom-right point at (x, y) = (451.1482, 347.6050).
C) OUTPUT OF print(instances.pred_boxes[0])
Boxes(tensor([[289.3555, 17.8171, 451.1482, 347.6050]]))
Explanation: with this command I print only the coordinates of the first box (instances.pred_boxes[0]).
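The per-box numbers above can be pulled apart in plain Python. A minimal sketch, using values copied from the example output rather than a live model (in real use they would come from instances.pred_boxes.tensor.tolist()):

```python
# Box values copied from the example output above; in real use they
# would come from instances.pred_boxes.tensor.tolist().
boxes = [
    [289.3555, 17.8171, 451.1482, 347.6050],
    [382.5501, 14.9712, 635.7133, 231.8446],
]
for x1, y1, x2, y2 in boxes:
    # Each box is (x1, y1) top-left and (x2, y2) bottom-right.
    width, height = x2 - x1, y2 - y1
    print(f"top-left=({x1:.1f}, {y1:.1f}) width={width:.1f} height={height:.1f}")
```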
PART2 SEE ALSO
A) https://detectron2.readthedocs.io/tutorials/models.html#model-output-format
B) https://github.com/facebookresearch/detectron2/issues/356
PART3 This is my code. Basically I have added 3 print instructions before the return instruction in the file https://github.com/facebookresearch/detectron2/blob/master/demo/predictor.py
def run_on_image(self, image):
    vis_output = None
    predictions = self.predictor(image)
    # Convert image from OpenCV BGR format to Matplotlib RGB format.
    image = image[:, :, ::-1]
    visualizer = Visualizer(image, self.metadata, instance_mode=self.instance_mode)
    if "panoptic_seg" in predictions:
        panoptic_seg, segments_info = predictions["panoptic_seg"]
        vis_output = visualizer.draw_panoptic_seg_predictions(
            panoptic_seg.to(self.cpu_device), segments_info
        )
    else:
        if "sem_seg" in predictions:
            vis_output = visualizer.draw_sem_seg(
                predictions["sem_seg"].argmax(dim=0).to(self.cpu_device)
            )
        if "instances" in predictions:
            instances = predictions["instances"].to(self.cpu_device)
            vis_output = visualizer.draw_instance_predictions(predictions=instances)
            print(instances)
            print(instances.pred_boxes)
            print(instances.pred_boxes[0])
    return predictions, vis_output
PART4 To test my code I run these commands in the bash shell.
COMMAND1: cd /000myfiles/anacondadir1/detectron2/demo
COMMAND2: python3 demo.py --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --input my_image.jpg --opts MODEL.DEVICE cpu MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl &
@kenny1323
Wow!!. Thank you very much !! All "Bow" to your work.
Hi, I have a problem. In my case I want the box coordinates as individual values, because I need to extract the detected object from the main image. I can get all the coordinates as below:
Boxes(tensor([[2054.7739, 287.8489, 2595.0151, 728.5417]], device='cuda:0'))
But I have not been able to save each element as an individual value (x1=2054.7739, y1=287.8489, ...). I need each element to crop the image and get only the detected object. I tried to convert the box element to a list (.tolist) but that didn't work. Any help?
@Warday. Hi. Here you can find my directory /detectron2/demo https://github.com/kenny1323/detectron2_ken
PART1 About the box extraction. I have added 2 files:
1) cp demo.py extract_person_box.py
2) cp predictor.py extract_person_box_core.py
I have edited extract_person_box.py and extract_person_box_core.py in the following way.
The file extract_person_box.py is basically the same as demo.py; there are only a few differences. The file extract_person_box_core.py has a new block of code tagged START_BOXES_EXTRACTION. Inside extract_person_box_core.py, in particular, search for the crop instruction.
You should read the file readme.txt too. https://github.com/kenny1323/detectron2_ken/blob/master/README.txt
F="/SUPERDIR1"/allfile/1.png
cd /000myfiles/anacondadir1/detectron2/demo
python3 extract_person_box.py --config-file /000myfiles/anacondadir1/detectron2/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --input $F --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl & sleep 3
PART2 About the mask extraction. I have added 2 files:
1) cp demo.py extract_mask.py
2) cp predictor.py extract_mask_core.py
I have edited extract_mask.py and extract_mask_core.py in the following way.
The file extract_mask.py is basically the same as demo.py; there are only a few differences. The file extract_mask_core.py has a new block of code tagged START_MASK_EXTRACTION. The image /detectron2/demo/000028.jpg._out1.png is an example of mask extraction. Basically, the alpha channel of every pixel of the mask is set to zero. url_image: https://github.com/kenny1323/detectron2_ken/blob/master/000028.jpg._out1.png
F="/SUPERDIR1"/allfile/1000.png
cd /000myfiles/anacondadir1/detectron2/demo
python3 extract_mask_cumulative.py --config-file /000myfiles/anacondadir1/detectron2/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --input $F --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl & sleep 3
Post scriptum. About the image 000028.jpg._out1.png, you should invert the transparency, namely: for any pixel with alpha channel 0, change it to alpha channel 255; for any pixel with alpha channel not 0, change it to alpha channel 0.
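The inversion described above can be done in one pass over the alpha channel. A minimal sketch with NumPy, using a small dummy RGBA array in place of 000028.jpg._out1.png:

```python
import numpy as np

# Dummy 4x4 RGBA image standing in for 000028.jpg._out1.png.
rgba = np.zeros((4, 4, 4), dtype=np.uint8)
rgba[1:3, 1:3, 3] = 200  # pretend these pixels belong to the mask (alpha != 0)

# Invert the transparency: alpha 0 becomes 255, alpha != 0 becomes 0.
rgba[..., 3] = np.where(rgba[..., 3] == 0, 255, 0)
print(rgba[0, 0, 3], rgba[1, 1, 3])  # background is now opaque, mask transparent
```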
Hi @Warday,
based on the @deeplearner93 image attached to this issue, you can just do
output_pred_boxes = outputs["instances"].pred_boxes
for i in output_pred_boxes:
    print(i.cpu().numpy())
and you will get the individual bounding boxes with ease.
Thanks kenny1323, by reading the source code of extract_mask_core.py I could extract each box. Thanks elmonisch, I will check which is faster. I did
Box = outputs["instances"].pred_boxes
a = Box.tensor.cpu()
a = a.numpy()
and then navigated each box. Thanks for both answers.
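@Warday's steps boil down to unpacking one row of the tensor. A minimal sketch with NumPy, using a stand-in array for pred_boxes.tensor.cpu().numpy() so it runs without detectron2:

```python
import numpy as np

# Stand-in for outputs["instances"].pred_boxes.tensor.cpu().numpy();
# values copied from the Boxes(...) output quoted above.
a = np.array([[2054.7739, 287.8489, 2595.0151, 728.5417]])

# Each row is one box; unpack it into four scalars.
x1, y1, x2, y2 = a[0]
print(x1, y1, x2, y2)
```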
Hey @kenny1323! I want to get the bounding boxes of a person re-identification system. Can you help?
@kenny1323 , Thanks a lot for your explanation here.
I have been trying to understand what print(outputs["instances"].pred_boxes) represents. I now know that it represents the coordinates of the detected boxes. But why are these coordinates decimals (float values)? Why are they not whole numbers?
Normally we would have coordinates starting at (0,0) in the top-left corner of the image, and the next pixel would be (0,1) in (x, y) format. But, as shown by @deeplearner93 here, we obtain values like (126.6035, 244.8977). Why is this the case?
@kenny1323, @hszkf, and @ppwwyyxx - if you are aware, I request you to please help me get a better understanding of this.
@sushmasuresh28 I'm having the same confusion. Were you able to get an answer to your question elsewhere? Using the values from pred_boxes does not let me crop out the objects; if they were truly coordinates, I should be able to use them for cropping the detected objects.
@sushmasuresh28 I think Detectron2 internally works in this way: it uses several algorithms and models to make several estimations (predictions) of several areas. The number 126.6035 is basically the averaged result.
For example, assume Detectron2 makes 3 estimations: 127, 126, 127. The average value is 126.66667.
@sarahdorich
To use the number 126.6035 to crop the image, you should probably convert it to an integer:
x = round(126.6035)
Now x is 127.
To crop the image, I use PIL. https://stackoverflow.com/questions/9983263/how-to-crop-an-image-using-pil
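Instead of PIL, the crop can also be done by slicing the image array directly once the floats are converted to integers. A minimal sketch with NumPy and a dummy image, using floor/ceil so the box is not shrunk by rounding:

```python
import math
import numpy as np

# Dummy image (H x W x C) standing in for a real frame read with OpenCV.
image = np.zeros((400, 700, 3), dtype=np.uint8)

# Box values copied from the example output earlier in the thread.
x1, y1, x2, y2 = 289.3555, 17.8171, 451.1482, 347.6050

# Floor the top-left and ceil the bottom-right so the whole box survives.
crop = image[math.floor(y1):math.ceil(y2), math.floor(x1):math.ceil(x2)]
print(crop.shape)  # (331, 163, 3)
```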
Hello all, I would like to get the coordinates of the bounding box of a particular predicted object in the image. For example, in the link mentioned below, the image has different objects detected by Detectron2, like cyclists, a bottle, a person, etc. Detectron2 image at source
What output I am expecting
I would like to get the coordinates of the bounding boxes of the 2 water bottles fixed on the bicycle frame, and maybe store them as a text file to infer from later, or print them to understand which bounding-box coordinates correspond to which object. As we have many objects in a single image, I would like to print the list of objects detected along with the coordinates of their bounding boxes.
Thank you in advance.