marek-simonik / record3d

Accompanying library for the Record3D iOS app (https://record3d.app/). Allows you to receive RGBD stream from iOS devices with TrueDepth camera(s).
https://record3d.app/
GNU Lesser General Public License v2.1
383 stars 57 forks

How can I modify initPose? #95

Open jeonhuhuhu opened 1 week ago

jeonhuhuhu commented 1 week ago

I am modifying the demo-main.py code on GitHub to create .r3d files in the same data format that the Record3D app produces.

[Question#1] The photo on the left shows the metadata created by the Record3D app, and the photo on the right shows the metadata JSON extracted by the demo-main.py code.

In the app, the initPose is output as `[0, 0, 0, 1, 0, 0, 0]`, i.e. seven values:

Quaternion (qx, qy, qz, qw) + Position (tx, ty, tz) = 7 values

but the demo-main.py code outputs the values shown on the right instead.

What is the reason? Is it a problem with the output data type, or something else?
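For reference, the 7-value layout described in this thread can be unpacked like this. This is only a sketch: the `"initPose"` key name and the `[qx, qy, qz, qw, tx, ty, tz]` ordering follow the thread's description, and the surrounding JSON is a made-up example.

```python
import json

# Made-up metadata snippet; the ordering [qx, qy, qz, qw, tx, ty, tz]
# is assumed from the description in this thread.
metadata = json.loads('{"initPose": [0, 0, 0, 1, 0, 0, 0]}')

qx, qy, qz, qw, tx, ty, tz = metadata["initPose"]
rotation = (qx, qy, qz, qw)  # unit quaternion
position = (tx, ty, tz)      # translation
```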

[Question#2] What I'm also curious about in the code: when the depth data is created, is the .conf data generated based on the depth? Or is .conf created individually, separate from .depth, from the camera's own depth-confidence information?

gnikoloff commented 1 week ago

Not the package author, but want to mention that the conf maps are generated independently from the depth in Apple APIs and more specifically ARKit. Confidence and depth are given separately while recording via a .confidenceMap and a .depthMap respectively.

In other words, the .conf and .depth files you extract from the .r3d files are generated independently, and their bytes are not "connected" or related.

jeonhuhuhu commented 1 week ago

> Not the package author, but want to mention that the conf maps are generated independently from the depth in Apple APIs and more specifically ARKit. Confidence and depth are given separately while recording via a .confidenceMap and a .depthMap respectively.
>
> In other words the .conf and .depth files you extract from the R3D files are generated independently and their bytes are not "connected" or related.

Thank you for the quick reply!!!

Additionally, I'm curious about depth-estimation techniques in artificial intelligence that can extract depth information from a single image (.jpg) as input.

I wonder whether it's possible to reproduce the .depth and .confidence data output by the Record3D app using only an image captured on an iPad as input.

marek-simonik commented 6 days ago

Apologies for the very late reply. Since your Question#2 has already been answered by @gnikoloff, let me answer the rest.

initPose

All poses stored in the poses array are expressed relative to the initPose. That's it; there is no more to it. You can choose an arbitrary pose to be the initPose; it doesn't even have to be one of the poses you got during USB streaming.

The easiest thing you can do is to save the identity transformation (`[0, 0, 0, 1, 0, 0, 0]`) as the initPose — like Record3D does — and simply store all the poses you obtained during USB streaming into the poses array.

Alternatively, you could choose, e.g., the first frame's pose as the initPose and transform the poses of all subsequent frames so that they are expressed relative to the first frame's pose: `posesArrayInMetadata[i] = inverse(usbFramePoses[0]) . usbFramePoses[i]`.
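The relative-pose formula above can be sketched in plain Python. This is an illustration, not part of the record3d API; it assumes each pose is a 7-value list `[qx, qy, qz, qw, tx, ty, tz]` as used in the metadata, with unit quaternions.

```python
def qmul(a, b):
    # Hamilton product of quaternions stored as (x, y, z, w)
    ax, ay, az, aw = a
    bx, by, bz, bw = b
    return (
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
        aw * bw - ax * bx - ay * by - az * bz,
    )

def qconj(q):
    # conjugate == inverse for a unit quaternion
    x, y, z, w = q
    return (-x, -y, -z, w)

def qrotate(q, v):
    # rotate vector v by unit quaternion q: q * (v, 0) * conj(q)
    vx, vy, vz = v
    x, y, z, _ = qmul(qmul(q, (vx, vy, vz, 0.0)), qconj(q))
    return (x, y, z)

def pose_inverse(pose):
    # pose = [qx, qy, qz, qw, tx, ty, tz]; inverse maps p' -> R^-1 (p' - t)
    q, t = tuple(pose[:4]), tuple(pose[4:])
    qi = qconj(q)
    ti = qrotate(qi, t)
    return [*qi, -ti[0], -ti[1], -ti[2]]

def pose_compose(a, b):
    # composition A . B: p -> R_A (R_B p + t_B) + t_A
    qa, ta = tuple(a[:4]), tuple(a[4:])
    qb, tb = tuple(b[:4]), tuple(b[4:])
    q = qmul(qa, qb)
    rt = qrotate(qa, tb)
    return [*q, rt[0] + ta[0], rt[1] + ta[1], rt[2] + ta[2]]

def relative_to_first(usb_frame_poses):
    # posesArrayInMetadata[i] = inverse(usbFramePoses[0]) . usbFramePoses[i]
    inv0 = pose_inverse(usb_frame_poses[0])
    return [pose_compose(inv0, p) for p in usb_frame_poses]
```

With this convention, the first entry of `relative_to_first(...)` always comes out as the identity pose `[0, 0, 0, 1, 0, 0, 0]`, which matches what the app stores as initPose.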

Depth estimation from RGB image

This is out of the scope of what Record3D aims to do. However, you might want to take a look at Apple's recently released Depth Pro repo, which performs depth estimation from color images, although without confidence maps.