microsoft / HoloLensForCV

Sample code and documentation for using the Microsoft HoloLens for Computer Vision research

Transforming a short-throw depth frame's pixel point to world coordinates #74

Closed Joon-Jung closed 5 years ago

Joon-Jung commented 5 years ago

I am trying to transform a certain pixel point in the short-throw depth frame into Unity's world coordinates, for object tracking and hologram augmentation. Since the projection matrix is not available, I used the HoloLensForCV project as a DLL in Unity to get the unprojected point through the MapImagePointToCameraUnitPlane method. I also made a small modification to HoloLensForCV to access the MediaFrameReference, so that I can call the SpatialCoordinateSystem's TryGetTransformTo method and get the frame-to-Unity-world transformation matrix. I obtained the inverted view transform matrix as well. All matrices (except the projection matrix) were obtained through the locatable camera. I computed the camera-to-world transform as frame-to-Unity-world transform * inverted view transform, and negated the third row to convert from UWP's right-handed coordinate system to Unity's left-handed one.
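For concreteness, here is a minimal Unity C# sketch of that matrix composition. The class and method names are mine, and I assume frameToUnityWorld and invertedViewTransform have already been marshalled into UnityEngine.Matrix4x4; only the Unity APIs themselves are real:

```csharp
using UnityEngine;

// Illustrative helper (hypothetical name); composes the camera-to-world
// matrix exactly as described above.
public static class DepthCameraPose
{
    public static Matrix4x4 CameraToWorld(Matrix4x4 frameToUnityWorld,
                                          Matrix4x4 invertedViewTransform)
    {
        // Camera-to-world = frame-to-Unity-world * inverted view transform.
        Matrix4x4 cameraToWorld = frameToUnityWorld * invertedViewTransform;

        // Negate the third row (index 2) to convert from UWP's right-handed
        // coordinate system to Unity's left-handed one.
        cameraToWorld.SetRow(2, -cameraToWorld.GetRow(2));
        return cameraToWorld;
    }
}
```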

So I tried the following process to transform the short-throw depth frame's pixel point into Unity's world coordinates (a sketch of these steps follows the list).

  1. Enable and start the short-throw depth sensor streaming through the HoloLensForCV DLL.
  2. Get the sensor's software bitmap and pick a certain pixel point (depth point).
  3. Pass the pixel point to MapImagePointToCameraUnitPlane to get the unprojected coordinate.
  4. Pack the returned value from MapImagePointToCameraUnitPlane into the vector (-X, -Y, -1) (from #63 it looks like the output of MapImagePointToCameraUnitPlane is inverted in X and Y).
  5. Multiply that vector by the pixel's intensity (depth) / 1000 (to convert to meters) to get the camera-space point.
  6. Transform the point into Unity's world coordinates with MultiplyPoint3x4, using the camera-to-world transform matrix from above.
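Here is a sketch of steps 4-6 in Unity C#. The unit-plane value is taken as an input rather than fetched from the DLL, since the exact wrapper signature depends on my modifications; only the Unity APIs (Vector3, Matrix4x4.MultiplyPoint3x4) are guaranteed:

```csharp
using UnityEngine;

public static class DepthUnprojection
{
    // pixelUnitPlane: the (X, Y) returned by MapImagePointToCameraUnitPlane
    // for the chosen pixel; depthMillimeters: that pixel's intensity.
    public static Vector3 PixelToWorld(Vector2 pixelUnitPlane,
                                       ushort depthMillimeters,
                                       Matrix4x4 cameraToWorld)
    {
        // Step 4: negate X and Y (the output appears inverted per #63), Z = -1.
        Vector3 ray = new Vector3(-pixelUnitPlane.x, -pixelUnitPlane.y, -1.0f);

        // Step 5: scale by the depth, converted from millimeters to meters.
        Vector3 cameraPoint = ray * (depthMillimeters / 1000.0f);

        // Step 6: map the camera-space point into Unity's world coordinates.
        return cameraToWorld.MultiplyPoint3x4(cameraPoint);
    }
}
```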

From my understanding and some experiments with the RGB camera, the transformed world coordinate should be the point where the captured real-world object is in Unity's world coordinates (for example, a hologram placed at the transformed world coordinate should sit on the object, like AR). However, the output looks as if the depth camera's pose is not accounted for in the transformation. For example, since the depth camera looks downward, I have to place the object significantly below my eye level to see the hologram (which should be augmented on the object) in front of me.
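This is how I check the result (a hypothetical debugging helper, not from the repo): drop a small marker at the computed point, so that if the pose were fully accounted for, the sphere would sit on the physical object instead of below my eye line:

```csharp
using UnityEngine;

// Hypothetical debugging helper: visualize where the pipeline thinks the
// tracked object is.
public static class DepthDebug
{
    public static void PlaceMarker(Vector3 worldPoint)
    {
        GameObject marker = GameObject.CreatePrimitive(PrimitiveType.Sphere);
        marker.transform.position = worldPoint;             // computed world point
        marker.transform.localScale = Vector3.one * 0.03f;  // ~3 cm sphere
    }
}
```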

I have read all the issues related to depth in this repository, including #37, #38, #63 and #64, but I really have no idea why this problem is happening. Could anyone give an idea of why it happens and how to solve it? Thank you in advance.

alemarro commented 5 years ago

Have you found a solution to your question? I am also stuck.

kaiwu119 commented 3 years ago

@Joon-Jung Have you solved this problem?