Js-handtracking: https://github.com/jcmellado/js-handtracking A JavaScript library that tracks the hand based on the colour range of skin. The demo (https://youtu.be/0I1ar9Lrhsw) shows that the library can give an approximation of the outline of the hand. I had some trouble reproducing that due to the colour difference of my camera: my colour range was pinker than what the library defined. However, we can use the idea of colour area tracking to help define the hand surface area for projecting a DICOM image.
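For illustration, here's roughly what I mean by colour area tracking: a minimal sketch that thresholds each webcam pixel with a well-known RGB skin heuristic. The bounds are generic defaults, not the ranges js-handtracking actually uses, and they'd need tuning per camera (mine would need a pinker range):

```js
// Minimal sketch of colour area tracking: threshold each webcam pixel
// with a standard RGB skin heuristic. The bounds below are generic
// defaults, not js-handtracking's ranges - tune them per camera.
function skinMask(ctx, width, height) {
  var data = ctx.getImageData(0, 0, width, height).data;
  var mask = new Uint8Array(width * height);
  for (var i = 0; i < width * height; i++) {
    var r = data[i * 4], g = data[i * 4 + 1], b = data[i * 4 + 2];
    if (r > 95 && g > 40 && b > 20 &&
        Math.max(r, g, b) - Math.min(r, g, b) > 15 &&
        Math.abs(r - g) > 15 && r > g && r > b) {
      mask[i] = 255; // candidate skin pixel
    }
  }
  return mask; // the hand surface area to project the DICOM image onto
}
```

Each frame you'd draw the video to a canvas, run this over the pixels, and treat the largest connected blob of the mask as the hand area.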
A Virtual 3D Blackboard: 3D Finger Tracking using a Single Camera: http://crcv.ucf.edu/papers/fg2000.pdf This paper's method of hand tracking includes a length constraint on the tracked arm (elbow to fingertip) to recover the finger position from a single camera.
Very interesting ideas for hand tracking using only a single regular camera. However, with Bonestagram we might be looking at hand tracking when the palm is close to the camera, so we might not see the arm and elbow. I think it's possible to borrow the idea of the length constraint but instead apply it to the finger bones, knuckles, etc.
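To make the length-constraint idea concrete for fingers, here's a tiny sketch (point names and boneLength are made up) that snaps a noisily tracked fingertip back to a fixed distance from its knuckle:

```js
// Hypothetical sketch: enforce a fixed bone length between two tracked
// points (e.g. knuckle -> fingertip), the way the paper constrains the
// arm. boneLength would come from a one-time calibration pose.
function constrainToBoneLength(knuckle, fingertip, boneLength) {
  var dx = fingertip.x - knuckle.x;
  var dy = fingertip.y - knuckle.y;
  var dist = Math.sqrt(dx * dx + dy * dy) || 1; // avoid divide-by-zero
  var scale = boneLength / dist;
  // Pull the noisy fingertip estimate back onto the circle of radius
  // boneLength around the knuckle.
  return { x: knuckle.x + dx * scale,
           y: knuckle.y + dy * scale };
}
```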
Nice job @fayeli ! :) I'll read through these tomorrow, am very curious myself
Here's a bunch of resources, all related to using Haar cascades: a trained classifier that detects certain features, used in the Viola-Jones algorithm we've been using for face recognition. It seems like it's possible to train a Haar cascade to detect hands! https://github.com/mtschirs/js-objectdetect https://github.com/foo123/HAAR.js https://gigadom.wordpress.com/2011/10/12/hand-detection-through-haartraining-a-hands-on-approach/
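If we go this route, usage would presumably look like js-objectdetect's face-detection examples. This is a sketch assuming a trained hand classifier (called objectdetect.handopen here; the exact name may differ) plugs in the same way:

```js
// Sketch based on js-objectdetect's face-detection examples; assuming a
// trained hand classifier (objectdetect.handopen here - the exact name
// may differ) can be plugged in the same way as the face ones.
var video = document.getElementById('webcam');
var detector = new objectdetect.detector(
    video.width, video.height,
    1.1,                    // scale factor between detection scales
    objectdetect.handopen); // assumed trained hand classifier

function tick() {
  var hands = detector.detect(video); // array of [x, y, width, height]
  if (hands.length > 0) {
    console.log('hand candidate at', hands[0]);
  }
  requestAnimationFrame(tick);
}
tick();
```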
Thanks @sacdallago !
I did A LOT of reading into hand tracking over the last two days. I selected the ones here because they have interesting ideas I can understand and am more likely to be able to use. But all in all, my finding is that hand tracking is a tough, ongoing computer vision research area. It's also particularly hard when using only a single camera with no depth sensing, like our regular webcam for Bonestagram!
However, many of the research papers I read/skimmed through approach hand tracking in terms of detecting hand gestures, like whether it's an OK sign and how individual fingers are crossing and curling. For Bonestagram, where we are just trying to do texture mapping, I feel like we might be able to cheat a little! I think we could make it a less complex problem by telling the user to keep a standard hand pose (like a high five towards the camera!).
For now, I think I'm gonna try implementing some simple, hacky hand tracking + texture mapping. Maybe we can see how well it does before trying some crazier computer vision math? How does that sound? @hesamrabeti
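Concretely, the hacky version I have in mind is something like this (handRect and boneImage are placeholders for whatever detector and bone texture we end up with):

```js
// Hacky texture-mapping sketch: given a detected hand rectangle and a
// pre-made bone image for the "high five" pose, stretch the texture
// over the webcam frame. handRect and boneImage are placeholders for
// whatever detector and DICOM-derived asset we end up with.
function drawBones(ctx, video, boneImage, handRect) {
  ctx.drawImage(video, 0, 0);                       // webcam background
  ctx.globalAlpha = 0.8;                            // let the hand show through
  ctx.drawImage(boneImage, handRect.x, handRect.y,  // anchor at the hand
                handRect.width, handRect.height);   // stretch to fit
  ctx.globalAlpha = 1.0;
}
```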
Cheers, Faye (:
http://vision.in.tum.de/_media/spezial/bib/cremers_et_al_ijcv02.pdf http://vision.in.tum.de/_media/spezial/bib/cremers_ivc.pdf
Great research! I agree we should start simple.
I would suggest we start by asking the user to place their hand on a hand outline and then track the individual fingers and palm from there, perhaps with an optical flow algorithm like https://inspirit.github.io/jsfeat/#opticalflowlk. Using these tracked points, we render the hand model. So we would have a minimum of 6 tracking points: 5 for the fingers and 1 for the palm. A further improvement to this model would be to add more tracking points and put constraints on the relative positions of the points (shape priors: http://vision.in.tum.de/research/shape_priors).
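For reference, a sketch of what tracking those 6 points with jsfeat's pyramidal Lucas-Kanade could look like, adapted from the library's optical-flow demo (window size, iteration count, and thresholds here are guesses to tune):

```js
// Sketch of tracking the 6 points (5 fingertips + palm) with jsfeat's
// pyramidal Lucas-Kanade, adapted from the library's optical-flow demo.
// prevXY starts as the 6 known coordinates on the hand outline, and
// prevPyr is assumed to have been built from the first frame already.
var w = 640, h = 480, pointCount = 6;
var prevPyr = new jsfeat.pyramid_t(3);
var currPyr = new jsfeat.pyramid_t(3);
prevPyr.allocate(w, h, jsfeat.U8_t | jsfeat.C1_t);
currPyr.allocate(w, h, jsfeat.U8_t | jsfeat.C1_t);

var prevXY = new Float32Array(pointCount * 2); // filled from the outline
var currXY = new Float32Array(pointCount * 2);
var status = new Uint8Array(pointCount);       // 1 = point still tracked

function trackFrame(imageData) {
  jsfeat.imgproc.grayscale(imageData.data, w, h, currPyr.data[0]);
  currPyr.build(currPyr.data[0], true);
  jsfeat.optical_flow_lk.track(
      prevPyr, currPyr, prevXY, currXY, pointCount,
      20,     // tracking window size (a guess)
      30,     // max iterations
      status,
      0.01,   // convergence epsilon
      0.001); // min eigenvalue threshold
  // currXY now holds the new fingertip/palm positions -> render hand model.
  var tmp = prevPyr; prevPyr = currPyr; currPyr = tmp; // swap pyramids
  prevXY.set(currXY);                                  // carry points over
}
```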
By telling the user to place their hands in a particular location in the image we are able to customize our detection algorithm to the characteristics of the webcam, the person's skin tone, and lighting conditions.
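As a sketch of that calibration step (outlineRect is hypothetical): once the hand is on the outline, we can sample the pixels inside it to learn this user's skin colour range:

```js
// Hypothetical calibration sketch: while the user holds their hand on
// the outline, sample the pixels inside it to learn this user's skin
// colour range under their camera and lighting. outlineRect is assumed
// to be the outline's bounding box in canvas coordinates.
function calibrateSkinRange(ctx, outlineRect) {
  var img = ctx.getImageData(outlineRect.x, outlineRect.y,
                             outlineRect.width, outlineRect.height);
  var min = [255, 255, 255], max = [0, 0, 0];
  for (var i = 0; i < img.data.length; i += 4) {
    for (var c = 0; c < 3; c++) {
      min[c] = Math.min(min[c], img.data[i + c]);
      max[c] = Math.max(max[c], img.data[i + c]);
    }
  }
  // Raw min/max is sensitive to stray background pixels; percentiles
  // would be more robust, but this shows the idea.
  return { min: min, max: max }; // feed back into the skin threshold
}
```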
Great job so far! I'm looking forward to what you come up with next.
Here is a good read on optical flow via the Lucas-Kanade method: http://robots.stanford.edu/cs223b04/algo_tracking.pdf
Accurate, Robust, and Flexible Real-time Hand Tracking http://research.microsoft.com/apps/mobile/publication.aspx?id=238453 http://research.microsoft.com/apps/mobile/Video.aspx?id=230533 The paper and video walk through the pipeline of their hand tracking algorithm.
At first I was wondering whether we could reproduce the algorithm from this paper, because the results are really impressive and it runs in real time, but then I realised they are using a single depth camera. I think that means it's something like a Kinect, not a regular webcam like the one we are gonna be using for Bonestagram. Is that right, @hesamrabeti ?