fayeli / biojs-vis-bonestagram

DICOM medical image visualisation on the web. Google Summer of Code project with BioJS.
https://bonestagram.herokuapp.com
MIT License

Research on hand model #6

Closed: sacdallago closed this issue 8 years ago

fayeli commented 8 years ago

Accurate, Robust, and Flexible Real-time Hand Tracking
http://research.microsoft.com/apps/mobile/publication.aspx?id=238453
http://research.microsoft.com/apps/mobile/Video.aspx?id=230533

The pipeline of their hand tracking algorithm:

  1. Extract square area of interest around the hand
  2. Reinitialisation: Find a number of possible hand poses. The 3D hand model is a mesh.
  3. Model fitting: Make predictions of hand poses and use particle-based optimisation to find the best-fitting model

At first I was thinking about whether we could reproduce the algorithm from this paper, because the result is really impressive and it's in real time, but then I realised they are using a single depth camera. I think that means it's something like a Kinect camera, not a regular webcam like the one we're going to be using for Bonestagram. Is that right, @hesamrabeti ?

fayeli commented 8 years ago

js-handtracking: https://github.com/jcmellado/js-handtracking

A JavaScript library that tracks the hand based on the colour range of the skin. The demo (https://youtu.be/0I1ar9Lrhsw) shows that the library can give an approximation of the outline of the hand. I had some trouble reproducing that due to colour differences with my camera: my colour range was pinker than what the library defines. However, we can use the idea of colour-area tracking to help define the hand surface area for projecting a DICOM image.
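To illustrate the colour-range idea, here is my own naive sketch (not js-handtracking's actual code; the thresholds are placeholders that would need tuning per camera and lighting, as I found with mine):

```js
// Rough sketch of colour-range hand segmentation on a <canvas> frame.
// The threshold values are placeholders, not js-handtracking's.
function skinMask(ctx, width, height) {
  var frame = ctx.getImageData(0, 0, width, height);
  var data = frame.data;
  var mask = new Uint8Array(width * height);
  for (var i = 0; i < mask.length; i++) {
    var r = data[i * 4], g = data[i * 4 + 1], b = data[i * 4 + 2];
    // Very naive "skin-ish" test: red dominant over green and blue.
    if (r > 95 && g > 40 && b > 20 && r > g && r > b && (r - Math.min(g, b)) > 15) {
      mask[i] = 1;
    }
  }
  return mask; // 1 = candidate hand pixel, 0 = background
}
```

From a mask like this we could take the largest connected blob as the hand surface area to project the DICOM image onto.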

fayeli commented 8 years ago

A Virtual 3D Blackboard: 3D Finger Tracking using a Single Camera: http://crcv.ucf.edu/papers/fg2000.pdf

This paper's method of hand tracking includes:

  1. Skin detection
  2. Motion detection to find arm
  3. Find finger points on the contour
  4. Construct spheres for the finger, elbow, and shoulder, and use length constraints for tracking

Very interesting ideas for hand tracking using only a single regular camera. However, with Bonestagram we might be looking at hand tracking when the palm is close to the camera, so we might not see the arm and elbow. I think it's possible to borrow the idea of length constraints but apply them to the finger bones, knuckles, etc. instead.
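As a toy illustration of what a finger-bone length constraint could look like (my own sketch, not the paper's code):

```js
// Illustration only: clamp a tracked fingertip so the finger "bone" keeps a
// fixed 2D length from the knuckle, which smooths out jittery detections.
function enforceBoneLength(knuckle, fingertip, boneLength) {
  var dx = fingertip.x - knuckle.x;
  var dy = fingertip.y - knuckle.y;
  var dist = Math.sqrt(dx * dx + dy * dy) || 1; // avoid divide-by-zero
  var scale = boneLength / dist;
  return { x: knuckle.x + dx * scale, y: knuckle.y + dy * scale };
}
```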

fayeli commented 8 years ago

https://github.com/VasuAgrawal/GestureDetection https://www.youtube.com/watch?v=oH0ZkfFoeYU

sacdallago commented 8 years ago

Nice job @fayeli ! :) I'll read through these tomorrow; I'm very curious myself.

fayeli commented 8 years ago

Here's a bunch of resources, all related to Haar cascades: a trained classifier that detects certain features, as used in the Viola-Jones algorithm we've been using for face detection. It seems like it's possible to train a Haar cascade to detect hands!
https://github.com/mtschirs/js-objectdetect
https://github.com/foo123/HAAR.js
https://gigadom.wordpress.com/2011/10/12/hand-detection-through-haartraining-a-hands-on-approach/
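If I'm reading the js-objectdetect examples right, detection would look roughly like this (treat it as a sketch: the open-hand classifier name, constructor arguments, and element id should be double-checked against the repo):

```js
// Rough sketch of Haar-cascade hand detection with js-objectdetect.
// Classifier name and exact API need verifying against the repo's examples.
var video = document.getElementById('webcam'); // placeholder element id
var detector = new objectdetect.detector(
  video.videoWidth, video.videoHeight,
  1.1,                  // scale factor between detection passes
  objectdetect.handopen // cascade trained on open hands
);
var hands = detector.detect(video, 1); // array of [x, y, width, height] rectangles
if (hands.length > 0) {
  console.log('open hand candidate at', hands[0]);
}
```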

fayeli commented 8 years ago

Thanks @sacdallago !

I did a LOT of reading into hand tracking over the last two days. I selected the ones here because they have interesting ideas that I can understand and am more likely to be able to use. All in all, my finding is that hand tracking is a tough, ongoing computer vision research area. It's particularly hard when using only a single camera with no depth sensing, like the regular webcam we'll be using for Bonestagram!

However, many of the research papers I read or skimmed through approach hand tracking in terms of detecting hand gestures, like whether it's an OK sign and how each individual finger is crossing and curling. For Bonestagram, where we are just trying to do texture mapping, I feel like we might be able to cheat a little! I think we could make it a less complex problem by telling the user to keep a standard hand pose (like a high five towards the camera!).

For now, I think I'm going to try implementing some simple, hacky hand tracking + texture mapping. Maybe we can see how well it does before trying some crazier computer vision math? How does that sound, @hesamrabeti?

Cheers, Faye (:
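P.S. By "cheating" I mean roughly this (a sketch only; the function, element, and `handRect` inputs are made up to show the idea): once some detection step gives us a rough hand rectangle, stretch the rendered DICOM image over it on a transparent overlay canvas, relying on the user holding the standard high-five pose.

```js
// Hacky texture-mapping sketch: paint the rendered DICOM image over the
// detected hand rectangle on a transparent overlay canvas.
function projectDicom(overlayCanvas, dicomImage, handRect) {
  var ctx = overlayCanvas.getContext('2d');
  ctx.clearRect(0, 0, overlayCanvas.width, overlayCanvas.height);
  ctx.globalAlpha = 0.7; // let some of the real hand show through
  ctx.drawImage(dicomImage, handRect.x, handRect.y, handRect.width, handRect.height);
}
```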

hesamrabeti commented 8 years ago

http://vision.in.tum.de/_media/spezial/bib/cremers_et_al_ijcv02.pdf http://vision.in.tum.de/_media/spezial/bib/cremers_ivc.pdf

Great research! I agree we should start simple.

I would suggest we start by asking the user to place their hand on a hand outline and then track the individual fingers and palm from there, perhaps with an optical flow algorithm like https://inspirit.github.io/jsfeat/#opticalflowlk. Using these tracked points, we render the hand model. So we would have a minimum of 6 tracking points: 5 for the fingers and 1 for the palm. A further improvement to this model would be to have more tracking points and to put constraints on the relative positions of the points (shape priors: http://vision.in.tum.de/research/shape_priors).
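A rough sketch of how that could be wired up with jsfeat, following its optical-flow sample (the element id, initial point positions, and parameter values below are placeholders to be tuned, not a worked-out implementation):

```js
// Track 6 hand points (5 fingertips + palm) with jsfeat's Lucas-Kanade flow.
var video = document.getElementById('webcam');        // placeholder element id
var canvas = document.createElement('canvas');
var ctx = canvas.getContext('2d');
var w = canvas.width = video.videoWidth;
var h = canvas.height = video.videoHeight;

var prevPyr = new jsfeat.pyramid_t(3);
var currPyr = new jsfeat.pyramid_t(3);
prevPyr.allocate(w, h, jsfeat.U8_t | jsfeat.C1_t);
currPyr.allocate(w, h, jsfeat.U8_t | jsfeat.C1_t);

var pointCount = 6;                                    // 5 fingertips + 1 palm point
var prevXY = new Float32Array(pointCount * 2);         // seeded from the hand outline
var currXY = new Float32Array(pointCount * 2);
var status = new Uint8Array(pointCount);               // 1 = point tracked successfully

function trackFrame() {
  ctx.drawImage(video, 0, 0, w, h);
  var imageData = ctx.getImageData(0, 0, w, h);
  jsfeat.imgproc.grayscale(imageData.data, w, h, currPyr.data[0]);
  currPyr.build(currPyr.data[0], true);
  jsfeat.optical_flow_lk.track(prevPyr, currPyr, prevXY, currXY, pointCount,
                               20, 30, status, 0.01, 0.001);
  // currXY now holds the updated finger/palm positions for rendering the model.
  var tmp = prevPyr; prevPyr = currPyr; currPyr = tmp;
  prevXY.set(currXY);
  requestAnimationFrame(trackFrame);
}
```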

By telling the user to place their hand in a particular location in the image, we are able to customize our detection algorithm to the characteristics of the webcam, the person's skin tone, and the lighting conditions.

Great job so far! I'm looking forward to what you come up with next.

hesamrabeti commented 8 years ago

Here is a good read on optical flow via the Lucas-Kanade method: http://robots.stanford.edu/cs223b04/algo_tracking.pdf
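For anyone skimming, the core of Lucas-Kanade in a nutshell (standard formulation, as in that reference): assume brightness constancy for a small window around each tracked point and solve a tiny least-squares system per point.

```latex
% Brightness constancy, linearised for each pixel p_i in the window:
%   I_x(p_i)\,u + I_y(p_i)\,v = -I_t(p_i)
% Stacking all n window pixels gives A d = b, solved in least squares:
A = \begin{bmatrix} I_x(p_1) & I_y(p_1) \\ \vdots & \vdots \\ I_x(p_n) & I_y(p_n) \end{bmatrix},
\qquad
b = -\begin{bmatrix} I_t(p_1) \\ \vdots \\ I_t(p_n) \end{bmatrix},
\qquad
d = \begin{bmatrix} u \\ v \end{bmatrix} = (A^{\top} A)^{-1} A^{\top} b
```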