Closed 89douner closed 7 years ago
The demo code is just an example that sorts the predictions by confidence and simply returns the top one (you can see this in the code here). The standard evaluation code (e.g. for Pascal evaluation here) can return up to 200 predictions per image. In terms of realtime detection, a couple of people have mentioned to me that they have used it on video. On a Pascal NVIDIA GPU with a 300x300 input image, it runs at up to around 70 fps (or around 55-60 fps on older architectures).
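To make the "top one" versus "up to 200" distinction concrete, here is a minimal sketch in Python (not the actual MATLAB demo code). The function name `top_k_predictions` and the `(label, confidence, box)` tuple layout are assumptions made for illustration only.

```python
def top_k_predictions(predictions, k=1, min_confidence=0.0):
    """Return the k highest-confidence predictions, best first.

    predictions: list of (label, confidence, box) tuples. This layout is a
    hypothetical one chosen for the example, not the mcnSSD output format.
    """
    kept = [p for p in predictions if p[1] >= min_confidence]
    kept.sort(key=lambda p: p[1], reverse=True)  # highest confidence first
    return kept[:k]


# Made-up detections for a single image.
preds = [
    ("dog", 0.62, (10, 20, 110, 220)),
    ("person", 0.91, (50, 30, 150, 300)),
    ("bicycle", 0.35, (5, 5, 60, 60)),
]

demo_style = top_k_predictions(preds, k=1)    # what the demo shows
eval_style = top_k_predictions(preds, k=200)  # evaluation-style cap
print(demo_style)
print(len(eval_style))
```

Raising `k` (and optionally setting `min_confidence`) is all it takes in this sketch to go from a single detection to several per image.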
ssd_demo_multi.zip I fixed some of the code in ssd_demo.m and succeeded in detecting and classifying multiple objects. The fixed code is 'ssd_demo_multi.m' (m-file attached).
However, I still do not understand the realtime (tracking) part. Could you explain in a little more detail how to track objects using mcnSSD? For example, a reference site or any useful technique, such as converting a video into frames.
P.S. I wanted to send you this reply by email, but I didn't because I don't know your email address.
Hi @89douner, thanks for the updated file. To make it clearer for users, I will update the demo to show the detection of multiple objects. With regard to tracking, SSD could be used to do independent frame-by-frame tracking (i.e. simply run the detector on each image one at a time). However, it does not have a built-in mechanism for performing tracking across frames; for this you would need to add something extra to improve performance. You can extract frames directly from a video with the MATLAB function here.
One simple approach could be to run the detector on all the frames, and then run a basic tracker such as KLT to join up the bounding boxes. Unfortunately, I'm not too knowledgeable on tracking so I'm not sure what the current best techniques are for this task :)
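As a rough illustration of "joining up" per-frame boxes, here is a small Python sketch. Note that it is not KLT: it swaps in a much simpler greedy linker based on box overlap (intersection over union) between consecutive frames. The function names, the `(xmin, ymin, xmax, ymax)` box layout and the `min_iou` threshold are all assumptions for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def link_detections(frames, min_iou=0.3):
    """Greedily join per-frame boxes into tracks (a stand-in, not KLT).

    frames: one list of boxes per video frame.
    Returns a list of tracks; each track is a list of (frame_index, box).
    """
    tracks = []
    for t, boxes in enumerate(frames):
        unmatched = list(boxes)
        for track in tracks:
            last_t, last_box = track[-1]
            # Only extend tracks that were still alive in the previous frame.
            if last_t != t - 1 or not unmatched:
                continue
            best = max(unmatched, key=lambda b: iou(last_box, b))
            if iou(last_box, best) >= min_iou:
                track.append((t, best))
                unmatched.remove(best)
        # Any box that matched nothing starts a new track.
        tracks.extend([[(t, b)] for b in unmatched])
    return tracks


# Made-up detections: two objects, one of which disappears in the last frame.
frames = [
    [(0, 0, 10, 10), (50, 50, 60, 60)],
    [(1, 1, 11, 11), (51, 50, 61, 60)],
    [(2, 2, 12, 12)],
]
tracks = link_detections(frames)
print([len(track) for track in tracks])
```

A real tracker would also handle missed detections, occlusion and class labels; this sketch only shows the basic idea of associating boxes across frames.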
This is the main question. Your reply said that SSD processes 70 fps (frames per second). In theory, shouldn't ssd_demo.m then take less than 1 sec to detect and classify a test image (and show the figure) in MATLAB? Although the updated code (ssd_demo.m) is improved, it still takes about 3 sec to classify a test image. I think this may be a problem, because the task time (3 sec) in ssd_demo.m does not match the SSD processing time (70 fps --> about 0.01 sec per frame in theory). If you know why, please let me know. Thanks!
There are a few things going on here. Firstly, in the demo, there are several operations taking place (loading the model, loading the image, resizing the image, running the network, and plotting the figure). Each of these takes some time. For example, if I run on my machine in CPU mode, I get a breakdown like this:
model loading time: 0.85 seconds
imread time: 0.12 seconds
imresize time: 0.01 seconds
net eval time: 0.71 seconds
figure generation time: 0.02 seconds
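A breakdown like the one above can be collected by wrapping each stage in a timer. The following is a minimal Python sketch of that idea rather than the MATLAB code that produced these numbers; the `timed` helper and the toy stand-ins for each stage are hypothetical.

```python
import time
from contextlib import contextmanager

timings = []  # (label, seconds) pairs, in execution order


@contextmanager
def timed(label):
    """Record how long the enclosed block takes under the given label."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings.append((label, time.perf_counter() - start))


# Toy stand-ins for the real stages (assumptions, not the demo's functions).
with timed("model loading"):
    model = {"scale": 2.0}
with timed("imread"):
    image = [[1.0, 2.0], [3.0, 4.0]]
with timed("imresize"):
    resized = [row[:1] for row in image[:1]]
with timed("net eval"):
    scores = [model["scale"] * v for row in resized for v in row]
with timed("figure generation"):
    summary = "top score: %.1f" % max(scores)

for label, seconds in timings:
    print("%s time: %.2f seconds" % (label, seconds))
```

Timing the stages separately makes it easy to see that the network evaluation is only one part of the total wall-clock time of a demo script.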
We can re-run in GPU mode, but it's worth being aware that timing things on the GPU is a little tricky. For example, if you simply run an image through the network you will get something like this:
model loading time: 0.91 seconds
imread time: 0.12 seconds
imresize time: 0.01 seconds
move net to GPU time: 3.05 seconds
net eval time: 0.888 seconds
figure generation time: 0.04 seconds
The GPU looks slower than the CPU! What is going on here? The issue is that the first time you run code on the GPU, the execution is pretty slow (there are various factors as to why this is the case). If we re-run the evaluation a few times with a for loop, we get:
model loading time: 0.92 seconds
imread time: 0.12 seconds
imresize time: 0.01 seconds
move net to GPU time: 2.79 seconds
net eval time: 1.072 seconds
net eval time: 0.032 seconds
net eval time: 0.028 seconds
net eval time: 0.026 seconds
net eval time: 0.025 seconds
figure generation time: 0.04 seconds
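The pattern in this log (one slow first call, then fast ones) is why benchmarks usually discard a few warm-up runs before timing. Here is a minimal Python sketch of that approach; `benchmark` and `LazyNet` are hypothetical names, and `LazyNet` only simulates a one-off setup cost with a short sleep. On a real GPU you would typically also need to synchronize the device before reading the clock, since kernel launches are usually asynchronous.

```python
import time


def benchmark(fn, warmup=1, repeats=4):
    """Run fn `warmup` times without recording, then time `repeats` more calls."""
    for _ in range(warmup):
        fn()
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return times


class LazyNet:
    """Hypothetical stand-in whose first call pays a one-off setup cost."""

    def __init__(self):
        self.calls = 0
        self.ready = False

    def __call__(self):
        self.calls += 1
        if not self.ready:
            time.sleep(0.05)  # simulated first-run overhead
            self.ready = True
        return sum(range(1000))  # simulated forward pass


net = LazyNet()
steady = benchmark(net, warmup=1, repeats=4)
print("steady-state mean: %.4f seconds" % (sum(steady) / len(steady)))
```

With the warm-up call excluded, the reported mean reflects the steady-state cost rather than the first-run overhead.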
The first execution is slow, but the following ones are gradually quicker (now up to around 40 Hz). To get benchmarks comparable to the Caffe code, I followed their approach and ran with a batch size of eight images (GPUs become much more efficient when processing batches of data in parallel):
model loading time: 0.95 seconds
imread time: 0.14 seconds
imresize time: 0.04 seconds
move net to GPU time: 2.85 seconds
net eval time: 1.326 seconds
net eval time: 0.118 seconds
net eval time: 0.118 seconds
net eval time: 0.114 seconds
Once it’s warmed up, it is processing 8 images in 0.114 seconds (i.e. 70.18 Hz). This is only on a Tesla M40 (it’s quicker on a Pascal). There are other things involved in running a proper timing benchmark, though (it should include image loading and pre-processing, and should avoid re-computing on the same image, which can over-exploit the cache), but hopefully this gives you a rough idea of the relative timings :)
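For reference, the arithmetic behind these rates is just images divided by seconds. A tiny Python sketch using the figures quoted in this thread (the helper names are made up for the example):

```python
def throughput_hz(batch_size, seconds_per_batch):
    """Images processed per second for one timed batch."""
    return batch_size / seconds_per_batch


def seconds_per_image(fps):
    """Average time budget per image at a given frame rate."""
    return 1.0 / fps


batched = round(throughput_hz(8, 0.114), 2)  # batch of eight, warmed up
single = round(throughput_hz(1, 0.025), 1)   # one image per call, warmed up
budget = round(seconds_per_image(70), 4)     # per-image time at 70 fps

print(batched, single, budget)  # prints: 70.18 40.0 0.0143
```

So 70 fps corresponds to roughly 0.014 seconds per image of network evaluation, which excludes model loading, moving the network to the GPU, and the slow first run.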
When I executed ssd_demo.m, I only got a single detection result. As far as I know, SSD is a multibox detector (multi-object detection). Do you have a plan to change or add code (for multi-object detection) in mcnSSD? Also, do you have a plan to add code for detection on video (real-time detection) to mcnSSD?
I will also try to write the code (for multi-object detection and detection on video) using mcnSSD.