OpenKinect / libfreenect2

Open source drivers for the Kinect for Windows v2 device

Face Tracking #332

Closed by robotsorcerer 9 years ago

robotsorcerer commented 9 years ago

I'm doing face tracking on a registered (depth -> color) image. Face detection is done, but I am having a hard time with tracking and with estimating objects' positions as seen from the Kinect camera center. Specifically, I want to retrieve the depth-image pixel values that correspond to the tracked features of the face. For example, suppose I pick an (x, y) point in the registered image (supposedly the color frame): how do I find the corresponding (x, y) point in the depth image without resorting to looking up the intensity values in each of the color and depth images?

xlz commented 9 years ago

Right now the color image is registered onto the depth image. (x,y) is the same for both images.

robotsorcerer commented 9 years ago

Thanks, @xlz . I appreciate the answer.

Maybe my question wasn't as clear as I thought earlier, so I am restating it a bit more clearly here.

Basically, I detect eyes within faces' ROI in the colored frame. Let's call a typical detected eye center coordinate, (x, y). Since the (x,y) values are the same for all three images, I want to pick up the equivalent depth value that corresponds to (x,y) in the depth image. I do this:

float depthIntensity = undistorted.at<float>(y, x);

to look up the corresponding undistorted depth float value and I discover that my depth points are not invariant to background color.

Whenever an object appears in the background of the detected object, even though the detected object's eye center is static (I am using a medical manikin at the moment), the depth value seems to change according to the color of the background. My questions:

(i) Should this be? Or am I incorrectly accessing the depth values? (ii) Why are we dividing all the depth frames by 4,500.0f as in here? (ii) Also, pardon my question if it is a tad silly: should the depth intensities be in pixels or millimeters or meters? (iii) If the way I am reading the depth values is wrong, what is the most appropriate way, in your opinion, to look up corresponding depth values?

Looking forward to your reply!

xlz commented 9 years ago

(i) Should this be? Or am I incorrectly accessing the depth values?

It should not. I think your way of access is correct. But I don't see how it results in the situation you described. If it is in the background, it is occluded and not visible. How is depth changed by invisible background color, is there an example?

(ii) Why are we dividing all the depth frames by 4,500.0f

It is a normalizing factor for visualization: 4500 millimeters is the nominal maximum distance.

should the depth intensities be in pixels or millimeters or meters?

It is in millimeters. Depth values come from 11-bit fixed point numbers. The best unit for that dynamic range is millimeter.

(iii) If the way I am reading the depth values is wrong

Your way is right.
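For readers who are not going through OpenCV, the same lookup can be sketched against a plain row-major float buffer. This is an illustration under stated assumptions, not library code: `depthAtMm` is a hypothetical helper, and treating non-positive values as "no measurement" is an assumption about how invalid pixels are stored in your buffer.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: read the depth in millimeters at pixel (x, y) from a
// row-major float buffer of size width * height.
// Returns false if (x, y) is out of bounds, the buffer size does not match,
// or the pixel holds no valid measurement (assumed stored as <= 0 or NaN).
bool depthAtMm(const std::vector<float>& depth, int width, int height,
               int x, int y, float& out_mm) {
    if (width <= 0 || height <= 0) return false;
    if (x < 0 || y < 0 || x >= width || y >= height) return false;
    if (depth.size() != static_cast<std::size_t>(width) * height) return false;

    // Row index first, then column: the same order as at<float>(y, x).
    const float d = depth[static_cast<std::size_t>(y) * width + x];
    if (!(d > 0.0f)) return false;  // rejects 0, negatives, and NaN

    out_mm = d;
    return true;
}
```

For a Kinect v2 depth frame the buffer would be 512 x 424, and (x, y) would be the eye center found in the registered image. The bounds check matters in practice because a detector can return coordinates that fall slightly outside the frame.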

robotsorcerer commented 9 years ago

Thanks @xlz .

How is depth changed by invisible background color, is there an example?

I'm away from my computer right now but my observation is that if I stay behind the manikin head during tracking, I notice the depth values of the tracked eye point change even when the manikin is static.

Moreover, the retrieved depth values I am getting are decimals (up to 5 decimal places, if I remember correctly). Moving the object toward or away from the camera between 0.5 and 1.5 m in ground-truth distance produces a measly 0.18xxx +/- 0.5 units in the depth values. Have you tried to evaluate the accuracy/precision of the Kinect depth camera? How accurate are the measurements you have retrieved from the TOF Kinect?

xlz commented 9 years ago

the depth values of the tracked eye point change even when the manikin is static

Is it a small error or large bias?

0.18xxx +/- 0.5 units

What does this mean? From -0.32 meter to 0.68 meter?

A small depth measurement offset is a known issue: https://github.com/OpenKinect/libfreenect2/issues/144
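One way to tell a small frame-to-frame error from a constant bias is to log the depth at the same pixel over many frames of a static scene and compare the spread with the offset from a tape-measured distance. The sketch below is only an illustration; `DepthStats` and `summarizeSamples` are hypothetical names, and the ground-truth distance is something you would have to measure yourself.

```cpp
#include <cmath>
#include <vector>

// Summary of repeated depth readings at one pixel, all in millimeters.
struct DepthStats {
    double mean_mm;    // average reading
    double stddev_mm;  // frame-to-frame spread (population standard deviation)
    double bias_mm;    // mean minus the hand-measured ground truth
};

// Hypothetical helper: summarize readings taken from a static scene.
// An empty input yields all zeros.
DepthStats summarizeSamples(const std::vector<float>& samples_mm,
                            double ground_truth_mm) {
    DepthStats s{0.0, 0.0, 0.0};
    if (samples_mm.empty()) return s;

    double sum = 0.0;
    for (float v : samples_mm) sum += v;
    s.mean_mm = sum / samples_mm.size();

    double sq = 0.0;
    for (float v : samples_mm) {
        const double diff = v - s.mean_mm;
        sq += diff * diff;
    }
    s.stddev_mm = std::sqrt(sq / samples_mm.size());
    s.bias_mm = s.mean_mm - ground_truth_mm;
    return s;
}
```

A small `stddev_mm` with a nonzero `bias_mm` points to a constant offset, while a large `stddev_mm` on a static target suggests the tracked pixel is unstable, for example because it sits near an object edge.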

robotsorcerer commented 9 years ago

Sorry for the ambiguity. I rushed through typing that. I meant a corresponding 0.18 to ~0.7 meter.

robotsorcerer commented 9 years ago

I think I forgot to multiply the depth intensity I was retrieving by 4500.0f. It's a bit more reasonable now (in mm), though the offset is still there.