weiliu89 / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

Crop the image before detection #88

Open forever3000 opened 8 years ago

forever3000 commented 8 years ago

Hello,

I installed the SSD framework on a TK1 platform and ran the video detection example with the default VGG16 network and the VOC pre-trained snapshot (300x300). The speed was about 4-5 fps. I'm curious whether we can select an ROI (region of interest) in the image before detection, and whether that would make detection faster. If so, what should I modify?

Thanks

weiliu89 commented 8 years ago

4-5 fps sounds about right. It is about 6 fps on TX1.

I don't understand what you mean by selecting an ROI. There is no ROI in SSD. You have to warp the image to 300 x 300 if you use SSD300 for detection.
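To make the warping point concrete: if a region were cropped before detection, it would still be resized to 300 x 300 for SSD300, so the crop by itself should not shrink the network input. Any boxes found in the crop would also have to be shifted back into full-frame coordinates. A minimal Python sketch of that mapping, assuming the boxes come back as normalized (xmin, ymin, xmax, ymax) values in [0, 1] relative to the crop. The function name `roi_box_to_frame` and the ROI tuple layout are hypothetical and not part of the SSD code.

```python
def roi_box_to_frame(box, roi):
    """Map a normalized box inside a cropped ROI to full-frame pixel coordinates.

    box: (xmin, ymin, xmax, ymax), each in [0, 1], relative to the crop.
    roi: (x, y, width, height) of the crop in full-frame pixels
         (hypothetical layout chosen for this sketch).
    """
    xmin, ymin, xmax, ymax = box
    rx, ry, rw, rh = roi
    return (
        int(round(rx + xmin * rw)),
        int(round(ry + ymin * rh)),
        int(round(rx + xmax * rw)),
        int(round(ry + ymax * rh)),
    )


# Example: a box covering the centre of a 640x360 crop whose top-left
# corner sits at (100, 50) in the full frame.
frame_box = roi_box_to_frame((0.25, 0.25, 0.75, 0.75), (100, 50, 640, 360))
```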

forever3000 commented 8 years ago

Hi weiliu, thanks for your answer. May I ask one more question? When I run the video detection Python script, the video resolution drops significantly after detection. Does SSD capture the video frame, warp it to 300x300 for detection, and then display it at 300x300 resolution? Is there any way to recover the original resolution after detection? Thanks.

weiliu89 commented 8 years ago

You could hack the annotated_data_layer.cpp to output the original image, and connect it to detection_output_layer.c* to use it for display.

forever3000 commented 8 years ago

Yes, thanks. I will try that.

forever3000 commented 8 years ago

Hi @weiliu89, sorry to bring this issue up again, but I cannot figure it out by myself. Can you give me more details? Thanks.

karthikmswamy commented 8 years ago

Hi @forever3000 @weiliu89, I timed the forward pass on TK1 using ssd_detect.cpp and gettimeofday(). My forward pass takes about 630 ms, which is about 1.5 FPS and not 4-5 FPS. Can you suggest what the reason could be? Thank you.
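For reference, 630 ms per forward pass works out to 1000 / 630 ≈ 1.59 FPS. When comparing numbers like these, it may help to average several passes after a warm-up run, since the first call can include one-time setup. A rough Python sketch of that measurement follows. Here `forward` is a hypothetical stand-in for whatever runs one forward pass, and the helper names are illustrative only.

```python
import time


def average_ms(forward, runs=10, warmup=1):
    """Return the mean wall-clock time in milliseconds of forward() over `runs` calls."""
    for _ in range(warmup):
        forward()  # warm-up calls are excluded from the measurement
    start = time.perf_counter()
    for _ in range(runs):
        forward()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / runs


def fps_from_ms(ms_per_frame):
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / ms_per_frame
```

With these helpers, `fps_from_ms(average_ms(forward))` gives a frame rate that can be compared directly against the 4-5 fps figure above.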

mai86 commented 7 years ago

@weiliu89 Hi, weiliu! How can I hack annotated_data_layer.cpp to output the original image and connect it to detection_output_layer.c* to use it for display? I'm new to this; can you give me more details? Thank you.

mai86 commented 7 years ago

@forever3000 Is there any way to recover the original resolution after detection? Did you solve this issue? Can you share your code? Thank you.

mai86 commented 7 years ago

@weiliu89 @forever3000 How can I get the original image to replace "bottom[3]" in "this->data_transformer_->TransformInv(bottom[3], &cv_imgs);"?

forever3000 commented 7 years ago

@mai86, you should use ssd_detect.cpp instead of the Python script; that way you get the original resolution.
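The idea is that the detections can be drawn on the original frame rather than on the 300x300 network input. A minimal Python sketch of that scaling, assuming each detection is a row [image_id, label, score, xmin, ymin, xmax, ymax] with box coordinates normalized to [0, 1]. The function name `scale_detections` and the default score threshold are illustrative choices, not taken from the SSD code.

```python
def scale_detections(detections, frame_width, frame_height, min_score=0.5):
    """Convert normalized detections to pixel boxes on the original frame.

    detections: iterable of rows
        [image_id, label, score, xmin, ymin, xmax, ymax]
        with box coordinates assumed to be normalized to [0, 1].
    Returns a list of (label, score, x1, y1, x2, y2) in original-frame pixels.
    """
    boxes = []
    for image_id, label, score, xmin, ymin, xmax, ymax in detections:
        if score < min_score:
            continue  # skip low-confidence detections
        boxes.append((
            int(label),
            score,
            int(round(xmin * frame_width)),
            int(round(ymin * frame_height)),
            int(round(xmax * frame_width)),
            int(round(ymax * frame_height)),
        ))
    return boxes
```

The returned pixel boxes can then be drawn on the full-resolution frame, so the displayed video keeps its original size regardless of the network input size.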

weiliu89 commented 7 years ago

@mai86 It is doable. The essence is that you should modify this, and change load_batch() in the annotated data layer to output the original image.