marek-simonik / record3d

Accompanying library for the Record3D iOS app (https://record3d.app/). Allows you to receive RGBD stream from iOS devices with TrueDepth camera(s).
GNU Lesser General Public License v2.1

Possibility to acquire the camera position from the r3d file? #59

Closed t19cs008 closed 1 year ago

t19cs008 commented 1 year ago

Hi,

In my project, I would like to use the Record3D app as one of its components. I need three pieces of information: the RGB frame, the depth frame, and the camera position. In my use case, I will record an object with the Record3D app and save the recording locally on the iPad. Then I would like to export the saved file for further processing. My question concerns these three pieces of information, especially the camera position: how can I acquire them? Would it be possible to get them through the shareable/internal format (.r3d)?

t19cs008 commented 1 year ago

In addition to the camera position, I would like to acquire the camera direction.

marek-simonik commented 1 year ago

Hi, yes, you can obtain camera pose (position + orientation) from the metadata JSON file found inside the .r3d files. See issue https://github.com/marek-simonik/record3d/issues/27 and https://github.com/marek-simonik/record3d/issues/33#issuecomment-1050097407 to get a description of the format.

t19cs008 commented 1 year ago

Thank you for your reply! I'm sorry that this question is not strictly about the app itself. Could you please show me how to calculate the camera position and camera direction from the quaternion and world pose? In my project, I use Python 3.

t19cs008 commented 1 year ago

I tried to refer to https://github.com/marek-simonik/record3d/issues/55#issuecomment-1453090644 and other websites such as https://stackoverflow.com/questions/73506103/how-to-use-quaternions-to-store-the-rotation-of-a-camera, and I wrote code to plot the camera positions and directions. Is my calculation correct? The code below plots the cameras viewed from the top down.

# pip install numpy numpy-quaternion matplotlib
from quaternion import quaternion, as_rotation_matrix
import numpy as np

P = np.asarray([[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]])  # flips the Y and Z axes

def apply_permutation_transform(matrix):
    return P @ matrix @ P.T

def get_mat_from_pose(pose):
    matrix = np.eye(4)
    qx, qy, qz, qw, px, py, pz = pose
    matrix[:3, :3] = as_rotation_matrix(quaternion(qw, qx, qy, qz))
    matrix[:3, -1] = [px, py, pz]
    return matrix

if __name__=="__main__":
    import matplotlib.pyplot as plt
    from matplotlib import patches

    # camera position
    # to plot in 2d, cam is [x.point, z.point] list
    x = []
    y = []
    z = []
    cam = []

    # camera direction
    x_dir = []
    y_dir = []
    z_dir = []

    json_file = "metadata"
    xyz_wrt_cam = np.array([[0.0, 0.0, 0.0, 1]])  # to calculate camera position
    dir_wrt_cam = np.array([[0.0, 0.0, 1, 0.0]])  # to calculate camera direction
    with open(json_file, "r") as f:
        import json
        metadata = json.loads(f.read())

    count = 0
    for q in metadata["poses"]:

        # reduce camera pose
        count += 1
        if count % 10 != 1:
            continue

        mat = get_mat_from_pose(q)
        mat_trf = apply_permutation_transform(mat)

        # calculate camera position
        obj_pts = (mat_trf @ xyz_wrt_cam.T)[:3]

        # store result of camera position
        x.append(obj_pts[0].item())
        y.append(obj_pts[1].item())
        z.append(obj_pts[2].item())
        cam.append([obj_pts[0].item(), obj_pts[2].item()])

        # calculate camera direction
        obj_dir = (mat_trf @ dir_wrt_cam.T)[:3]

        # store result of camera direction
        x_dir.append(obj_dir[0].item())
        y_dir.append(obj_dir[1].item())
        z_dir.append(obj_dir[2].item())

    colors = ["green", "red", "cyan", "magenta", "yellow"]
    fig, ax = plt.subplots()
    ax.scatter(x, z, s = 0.4)
    for i, ca in enumerate(cam):
        ax.add_patch(patches.Circle(xy = ca, radius = 0.001, fill=False, color = colors[i % len(colors)]))
        ax.arrow(x = x[i], y = z[i], dx = x_dir[i], dy = z_dir[i])
    plt.xlabel("X")
    plt.ylabel("Z")
    plt.title("Camera Points")
    plt.show()

marek-simonik commented 1 year ago

I modified the code a bit; it should now produce the camera positions in world space and the camera viewing directions in world space (the direction of the camera's negative Z axis).

from pyquaternion import Quaternion
import numpy as np
import json

def get_mat_from_pose(pose):
    """Convert a pose stored as [qx, qy, qz, qw, px, py, pz] into a 4x4 camera-to-world matrix."""
    matrix = np.eye(4)
    qx, qy, qz, qw, px, py, pz = pose
    matrix[:3, :3] = Quaternion(qw, qx, qy, qz).rotation_matrix
    matrix[:3, -1] = [px, py, pz]
    return matrix

if __name__ == '__main__':
    json_file = 'metadata'

    with open(json_file, 'rt') as f:
        metadata = json.loads(f.read())

    positions_world = []
    directions_world = []
    count = 0
    for q in metadata['poses']:

        # reduce camera pose
        count += 1
        if count % 10 != 1:
            continue

        mat = get_mat_from_pose(q)
        curr_pos = mat[:3, -1]  # camera position (translation column)
        curr_dir = -mat[:3, 2]  # viewing direction (the camera looks along its local -Z axis)
        positions_world.append(curr_pos)
        directions_world.append(curr_dir)

    positions_world = np.array(positions_world)
    directions_world = np.array(directions_world)
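
As a side note on why `-mat[:3, 2]` gives the viewing direction: the columns of the rotation matrix are the camera's local axes expressed in world coordinates, and in the OpenGL convention the camera looks along its local -Z axis, so the world-space view direction is minus the third column. A small self-contained check, using an explicit quaternion-to-matrix formula instead of pyquaternion so it runs without extra dependencies:

```python
import numpy as np

def quat_to_rotmat(qw, qx, qy, qz):
    """Rotation matrix of a unit quaternion (same layout as pyquaternion's)."""
    return np.array([
        [1 - 2*(qy*qy + qz*qz), 2*(qx*qy - qw*qz),     2*(qx*qz + qw*qy)],
        [2*(qx*qy + qw*qz),     1 - 2*(qx*qx + qz*qz), 2*(qy*qz - qw*qx)],
        [2*(qx*qz - qw*qy),     2*(qy*qz + qw*qx),     1 - 2*(qx*qx + qy*qy)],
    ])

# A 90-degree rotation about the Y axis: a camera initially looking
# down -Z should end up looking along -X.
s = np.sqrt(0.5)
R = quat_to_rotmat(s, 0.0, s, 0.0)

# Two equivalent ways of computing the viewing direction:
via_column = -R[:, 2]                          # minus the third column
via_rotation = R @ np.array([0.0, 0.0, -1.0])  # rotate the local -Z axis
```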

t19cs008 commented 1 year ago

Thank you very much for correcting the code! Just to double-check my understanding: by "the direction of the negative Z axis", do you mean a right-handed coordinate system?

marek-simonik commented 1 year ago

…does it mean a right-handed coordinate system?

Yes, that is correct. Record3D uses the same coordinate system as OpenGL.
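
For anyone who needs a different convention: the diagonal flip `diag(1, -1, -1, 1)` used earlier in this thread is a common way to switch a camera matrix between the OpenGL-style convention (right-handed, camera looks down -Z, Y up) and the OpenCV-style convention (camera looks down +Z, Y down). This framing is my addition, not something stated by the app's author; the helper name below is hypothetical.

```python
import numpy as np

P = np.diag([1.0, -1.0, -1.0, 1.0])  # flips the Y and Z axes

def flip_convention(mat):
    """Re-express a 4x4 camera-to-world matrix in the flipped camera convention."""
    return P @ mat @ P.T
```

Because P is its own inverse, applying the flip twice returns the original matrix, and the determinant of its rotation part is +1, so handedness is preserved.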

I think your original question has been answered, so I am closing this issue. But feel free to reopen it if you encounter any problems.