ab39826 / TrafficDetection

We implement a system for vehicle detection and tracking from traffic video using Gaussian mixture models and Bayesian estimation. The system provides robust foreground segmentation of moving vehicles through a K-means clustering approximation, and establishes vehicle correspondence between frames for tracking by combining Kalman and particle filters.

Re-adapting the code for people extraction #1

Open chfakht opened 8 years ago

chfakht commented 8 years ago

Thanks a lot, I was able to run the test easily. But I want to do foreground extraction for people detection, and the code didn't produce useful results (for example on https://www.youtube.com/watch?v=UF8uR6Z6KLc). Can you please help me extract the person from a video? I know a little about the Kalman filter, but I'd also like some explanation of how to change the parameters in the code, because your implementation seems the most useful to me :) Here are some results I obtained by running the code on https://www.youtube.com/watch?v=OFPwDe22CoY : http://hpics.li/c8cd1fc http://hpics.li/23c2b2f http://hpics.li/caa1ce2

You can notice some interference in the captured images, but I mostly think there are some parameters I can change to make it actually work.

Thanks

ab39826 commented 8 years ago

Hey man, the problem you're trying to solve will be somewhat more difficult than my vehicle detection use case. However, I'll try to give some insight into why it's not performing as well on your video and what you can do to potentially change that.

First, if you're only interested in isolating the person in the video frame from the background, you won't need to do any tracking. Rather, the task you're interested in accomplishing is called "foreground extraction".

I would recommend reading this paper by Stauffer and Grimson for the mathematical insight behind how this algorithm and my implementation work. http://www.ai.mit.edu/projects/vsam/Publications/stauffer_cvpr98_track.pdf

However, if you're interested in keeping track of objects/people between frames, then the Kalman filtering approach is good for doing so.

The difference is highlighted in my YouTube demo between the top-right frame (just foreground extraction) and the bottom-right frame (foreground extraction with object tracking):

https://www.youtube.com/watch?v=GwyQ3QdBzaY

Based on the images that you posted, I'd try a couple of things. First, in my implementation, the way the algorithm collects "evidence" for foreground vs. background pixels is through the adaptive process (as described in the Stauffer paper).

For computational efficiency, I took a shortcut: if a pixel sits in a low-change region, we skip the adaptive process there, since I set up my camera to be stationary over the road.

In foregroundEstimation.m, I have the line

pixThreshMap = min(sum(+(frameDifference(:,:,:)>pixelThresh),3),1);

which computes the low-change regions and skips the adaptive process at those pixels for the given frame. I would recommend not taking this shortcut and collecting evidence every time. BIG WARNING: this will make the computation a LOT slower, so I'd recommend running it overnight or something. However, this should in principle remove the phantom artifact to Obama's right in the third image.
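Roughly, the change would look something like this (a sketch on my part; it assumes the update loop only visits pixels where pixThreshMap is 1, so forcing the map to all ones runs the adaptive process everywhere):

% Sketch: disable the low-change shortcut so the adaptive update
% runs at every pixel (assumes the loop checks pixThreshMap == 1).
pixThreshMap = ones(size(frameDifference,1), size(frameDifference,2));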

Another thing you could try is to get rid of the connected component frame cleanup

[cleanFrame centroids] = connectedComponentCleanup(contrastFrame);

and just use contrastFrame instead. This will probably help with the tiny spots you see removed on Obama's face.
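Concretely, the swap might look like this (a sketch; it assumes later code still expects both outputs of the cleanup call):

% Sketch: bypass the connected-component cleanup but keep the
% variables that downstream code may expect.
% [cleanFrame centroids] = connectedComponentCleanup(contrastFrame);
cleanFrame = contrastFrame;   % use the uncleaned foreground mask directly
centroids = [];               % placeholder; tracking code may need guarding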

In terms of parameters to change, I'd suggest experimenting with K, T, and alpha. Increasing K increases processing time but makes the foreground extraction more robust; alpha is the learning rate and determines how quickly the algorithm adapts its estimate of the background components; T corresponds to the minimum portion of the data that should be accounted for by background processes. Honestly, there's no obvious answer to how exactly you should change these parameters other than trial and error.
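To make those roles concrete, here's an illustrative per-pixel sketch of the Stauffer-Grimson update with toy values (not the repo's exact code), showing where K, alpha, and T enter:

% One grayscale pixel, K Gaussians (toy state, illustrative only)
K = 4;                  % components per pixel: more = more robust, slower
alpha = 0.01;           % learning rate: how fast the model adapts
T = 0.7;                % portion of mixture weight treated as background
w = ones(1,K)/K;        % component weights
mu = [50 100 150 200];  % component means
sigma = 20*ones(1,K);   % component standard deviations
x = 102;                % new observation at this pixel

[d, m] = min(abs(x - mu) ./ sigma);  % closest component, in sigma units
if d < 2.5                           % matched within 2.5 sigma
    w = (1 - alpha)*w;  w(m) = w(m) + alpha;   % reinforce the match
    rho = alpha * exp(-0.5*((x - mu(m))/sigma(m))^2) / (sqrt(2*pi)*sigma(m));
    mu(m) = (1 - rho)*mu(m) + rho*x;           % adapt the mean
    sigma(m) = sqrt((1 - rho)*sigma(m)^2 + rho*(x - mu(m))^2);  % adapt variance
else                                 % no match: replace the weakest component
    [~, m] = min(w);  mu(m) = x;  sigma(m) = 30;  w(m) = 0.05;
    w = w / sum(w);
end

% Background model: the first B components (sorted by w/sigma) whose
% cumulative weight exceeds T; x is foreground if it matches none of them.
[~, order] = sort(w ./ sigma, 'descend');
B = find(cumsum(w(order)) >= T, 1);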

The hardest thing, though, is that none of these videos come from a stationary camera, which will make the foreground extraction messier. There are ways to get around this: for example, a number of video processing approaches related to global motion compensation will reduce the jitter from the camera and correspondingly improve the extraction accuracy.
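For example, a simple version of that stabilization (a sketch using standard Computer Vision Toolbox functions, not code from this repo; 'speech.mp4' is a placeholder) registers each frame to the previous one before the foreground step:

% Sketch: global motion compensation by registering consecutive frames.
vr = VideoReader('speech.mp4');      % placeholder input video
prevFrame = readFrame(vr);
currFrame = readFrame(vr);
prevGray = rgb2gray(prevFrame);
currGray = rgb2gray(currFrame);
[fPrev, vPrev] = extractFeatures(prevGray, detectSURFFeatures(prevGray));
[fCurr, vCurr] = extractFeatures(currGray, detectSURFFeatures(currGray));
pairs = matchFeatures(fPrev, fCurr);
% Estimate the transform mapping current-frame points onto the previous frame
tform = estimateGeometricTransform(vCurr(pairs(:,2)), vPrev(pairs(:,1)), 'affine');
stabilized = imwarp(currFrame, tform, 'OutputView', imref2d(size(prevGray)));
% Feed 'stabilized' (instead of the raw frame) into the foreground estimation.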

Anyways, sorry this is a lotta information and kinda rambling, but hope it helps!

Anurag


chfakht commented 8 years ago

Hi, thanks a lot for such wonderful information and explanations... I'm open to even more =D. I'm currently making the changes and re-running the test, but it will take some time. Here is a clearer look at the result (it's the actual output video): https://drive.google.com/file/d/0B9rbNYha-N6vWWVhTENJNGk0LVNmMEFoZnlPRFhHaElXSW5N/view?usp=sharing

In the meantime, I'll respond to you point by point. First, I wasn't clear: it's about people extraction, not tracking, which means, as you said, foreground extraction. I have already read the paper; it explains things well, but I don't think it's sufficient on its own, so I'll try to learn more about the adaptive process you've used. The processing is already very slow (I'd actually like to know why, because I've found some real tracking implementations using the Kalman filter), but it's doing well on the results. I will try the parameter changes you described. I'd also like to know what the other variables are used for: initVariance, pixelThresh, referenceDistance, numParticles, prevCentSize, and whether they could have some effect in my case.

The videos I'm trying to use are also nearly stationary, since they show people delivering speeches; that's why I got some useful results on the first try and decided to keep going with your project.

I really appreciate your help, and thank you for your time. I will apply your changes and let you know the results soon. Thanks

chfakht commented 8 years ago

Update: after commenting out [cleanFrame centroids] = connectedComponentCleanup(contrastFrame); I get the following error at frameIndex = 2:

Warning: No video frames were written to this file. The file may be invalid. 
> In VideoWriter/close (line 307)
  In VideoWriter/delete (line 256) 
Undefined function or variable "trackFrame".

Error in foregroundEstimation (line 252)
writeVideo(foregroundVideo,trackFrame);

Can you help, please?