apple / ARKitScenes

This repo accompanies the research paper, ARKitScenes - A Diverse Real-World Dataset for 3D Indoor Scene Understanding Using Mobile RGB-D Data and contains the data, scripts to visualize and process assets, and training code described in our paper.

Rotated videos #47

Closed mahtabbigverdi closed 1 year ago

mahtabbigverdi commented 1 year ago

Hi, great dataset! Some of the videos are flipped or rotated by 90 degrees. Are there any tags for detecting these videos so they can be rotated back? Thanks.

Levintsky commented 1 year ago

Hi, great question! Yes, the image is always stored in a fixed orientation, regardless of whether it was captured in landscape or portrait. One way I rectify it: after unprojecting to 3D world coordinates, the 3rd axis is always the z-axis, so we can decide the pose like this:

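import numpy as np


def decide_pose(pose):
    """
    Args:
        pose: np.array (4, 4)
    Returns:
        index: int (0, 1, 2, 3)
        for upright, left, upside-down and right
    """
    # pose style: third row of the rotation block of the pose matrix
    z_vec = pose[2, :3]
    # reference direction for each of the four orientations
    z_orien = np.array(
        [
            [0.0, -1.0, 0.0],  # upright
            [-1.0, 0.0, 0.0],  # left
            [0.0, 1.0, 0.0],  # upside-down
            [1.0, 0.0, 0.0],  # right
        ]
    )
    # pick the orientation whose reference direction best matches z_vec
    corr = np.matmul(z_orien, z_vec)
    corr_max = np.argmax(corr)
    return corr_max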

Then rotate the image by 0/90/180/270 degrees clockwise if the rotation index is 0/1/2/3.
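The rotation step could be sketched with NumPy as follows. The helper name `rotate_by_index` is hypothetical and not part of the repo. `np.rot90` rotates counterclockwise for positive `k`, so the index is negated to get a clockwise rotation.

```python
import numpy as np


def rotate_by_index(image, index):
    """Rotate an (H, W) or (H, W, C) image clockwise by index * 90 degrees.

    `index` is assumed to be the 0/1/2/3 value returned by decide_pose.
    """
    # np.rot90 is counterclockwise for positive k, so negate for clockwise.
    return np.rot90(image, k=-int(index))


if __name__ == "__main__":
    img = np.array([[1, 2, 3],
                    [4, 5, 6]])
    print(rotate_by_index(img, 1))
```

Because `np.rot90` acts on the first two axes by default, the same call works for color images stored as (H, W, C) arrays.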

mahtabbigverdi commented 1 year ago

Hi, thank you for the quick response. The issue is that some videos are rotated by 90 degrees, like video 41097980, while others look fine, like 47333462. For each video, the corresponding folder contains no pose matrix, just a .traj file. Also, for each timestamp there is a different location (3 angles) for the camera. So: 1. How can I create a pose matrix from those 3 rotation values? 2. Which timestamp's pose matrix should I use as input to the decide_pose function you provided?

afshindn commented 1 year ago

Please refer to the script here for rectifying images: threedod/benchmark_scripts/rectify_im.py
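For the first question above, here is a rough sketch of turning one .traj line into a 4x4 pose matrix. It rests on assumptions that should be checked against the referenced script: that each line holds seven whitespace-separated values (timestamp, three rotation values, three translation values), that the rotation values form an axis-angle vector rather than Euler angles, and that the stored transform is world-to-camera and so must be inverted to get a camera-to-world pose. The function names `axis_angle_to_matrix` and `traj_line_to_pose` are hypothetical.

```python
import numpy as np


def axis_angle_to_matrix(rvec):
    """Rodrigues' formula: axis-angle vector (radians) -> 3x3 rotation."""
    rvec = np.asarray(rvec, dtype=float)
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    # Skew-symmetric cross-product matrix of the unit axis.
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def traj_line_to_pose(line):
    """Parse one .traj line into (timestamp string, 4x4 pose matrix).

    ASSUMPTION: 7 values per line, axis-angle rotation, and a
    world-to-camera transform that is inverted before returning.
    """
    tokens = line.split()
    if len(tokens) != 7:
        raise ValueError("expected 7 values per .traj line")
    values = [float(t) for t in tokens[1:]]
    world_to_cam = np.eye(4)
    world_to_cam[:3, :3] = axis_angle_to_matrix(values[:3])
    world_to_cam[:3, 3] = values[3:]
    return tokens[0], np.linalg.inv(world_to_cam)
```

Which timestamp's pose to use, and whether this camera-to-world convention is the one decide_pose expects, is not settled in this thread; rectify_im.py is the place to confirm both.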