hasanirtiza / Pedestron

[Pedestron] Generalizable Pedestrian Detection: The Elephant In The Room. @ CVPR2021
https://openaccess.thecvf.com/content/CVPR2021/papers/Hasan_Generalizable_Pedestrian_Detection_The_Elephant_in_the_Room_CVPR_2021_paper.pdf
Apache License 2.0

Running Inference on Custom Dataset in CoCo-Format #136

Status: Closed (fritzeinhorn closed this issue 2 years ago)

fritzeinhorn commented 2 years ago

Hello, I find your project really impressive and am trying to run inference on my own dataset, which is annotated in COCO format. So far I have pasted my images into the demo/ folder, and the model works fine, writing the outputs to the _resultdemo/ folder. I am using the Colab version. However, I am trying to find out how to get the bounding box coordinates (with the corresponding confidence, if possible) as an output file, in any format, so that I can run further analysis.

I greatly appreciate your help!

fritzeinhorn commented 2 years ago

In addition, is it possible to run inference on only part of each image without cropping them all? In this case there is a sports video with spectators, and the detector should only detect people in a specific, defined area (e.g. the court).

hasanirtiza commented 2 years ago

I think if you go to tools/demo.py, on line 37 you will find results = inference_detector(model, image). The results variable should contain the bounding boxes, and you can save them in whatever format you want.
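A minimal sketch of what saving the boxes could look like, assuming results follows the mmdetection-style layout: a list with one entry per class, each entry holding rows of [x1, y1, x2, y2, score]. That layout, the function names detections_to_records and save_records, and the field names in the output are assumptions for illustration and are not taken from the repository.

```python
import json


def detections_to_records(image_name, results, score_thr=0.0):
    """Flatten detector output into a list of plain dicts.

    Assumption: `results` is a list with one entry per class, and each
    entry is a sequence of rows [x1, y1, x2, y2, score]. If the model also
    returns masks, `results` is assumed to be a (bbox, segm) tuple.
    """
    if isinstance(results, tuple):
        results = results[0]  # keep only the bounding-box part

    records = []
    for class_index, class_dets in enumerate(results):
        for row in class_dets:
            x1, y1, x2, y2, score = (float(v) for v in row)
            if score < score_thr:
                continue  # drop low-confidence detections
            records.append({
                "image": image_name,
                "class_index": class_index,
                # COCO-style box: top-left corner plus width and height
                "bbox": [x1, y1, x2 - x1, y2 - y1],
                "score": score,
            })
    return records


def save_records(records, out_path):
    """Write the flattened detections to a JSON file."""
    with open(out_path, "w") as f:
        json.dump(records, f, indent=2)
```

In the demo loop you would call detections_to_records(image_name, results) right after inference_detector returns, collect the records for all images in one list, and pass that list to save_records once at the end.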

hasanirtiza commented 2 years ago

Perhaps an easy (not necessarily the smartest) way is to simply check the coordinates of each individual bbox and, based on its location, decide whether to keep it or not.
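One way this coordinate check could be sketched, assuming each detection is a row of [x1, y1, x2, y2, score] and the area of interest is an axis-aligned rectangle in pixel coordinates. Using the bottom-center of the box (roughly where the feet are) as the test point is my own choice here, and the names bottom_center and filter_by_region are hypothetical.

```python
def bottom_center(box):
    """Return the bottom-center point of a box given as [x1, y1, x2, y2, ...]."""
    x1, y1, x2, y2 = box[:4]
    return ((x1 + x2) / 2.0, y2)


def filter_by_region(dets, region):
    """Keep detections whose bottom-center point lies inside `region`.

    dets:   rows of [x1, y1, x2, y2, score]
    region: (rx1, ry1, rx2, ry2) in pixel coordinates, a hypothetical
            rectangle around the area of interest (e.g. the court)
    """
    rx1, ry1, rx2, ry2 = region
    kept = []
    for det in dets:
        cx, cy = bottom_center(det)
        if rx1 <= cx <= rx2 and ry1 <= cy <= ry2:
            kept.append(det)
    return kept


# Toy example: the first box stands inside the region, the second does not.
dets = [
    [100, 100, 140, 200, 0.9],
    [500, 50, 540, 120, 0.8],
]
court = (0, 150, 300, 400)
kept = filter_by_region(dets, court)
```

A rectangle is only a rough fit if the camera sees the court at an angle. In that case the same structure would work with a point-in-polygon test in place of the rectangle comparison.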