zc-alexfan / arctic

[CVPR 2023] Official repository for downloading, processing, visualizing, and training models on the ARCTIC dataset.
https://arctic.is.tue.mpg.de

Distortion for egocentric viewpoints #51

Closed mks0601 closed 3 weeks ago

mks0601 commented 4 weeks ago

Hi Alex, how are you doing? I'd like to ask about distortion for the egocentric viewpoint. I'm trying to visualize the vertices of the SMPL-X mesh projected onto the egocentric view. However, some vertices are projected to wrong positions, even though I filter out points whose depth values are negative. Could you check the code below? You can run it with python test.py after setting arctic_root_path and smplx_root_path. Running it gives the visualization below, in which some vertices land at strange positions. I used the same distortion function as yours (https://github.com/zc-alexfan/arctic/blob/9f5770966350c66d8bf0ac3fd4cfde74434a109b/common/transforms.py#L82).

import os.path as osp
from glob import glob
import numpy as np
import cv2
import torch
import json
import smplx
from pytorch3d.io import load_obj, save_obj

# This function is from https://github.com/zc-alexfan/arctic/blob/9f5770966350c66d8bf0ac3fd4cfde74434a109b/common/transforms.py#L82
def distort_pts3d_all(_pts_cam, dist_coeffs):
    # egocentric cameras commonly have heavy distortion
    # this function transforms points from the undistorted camera coordinates
    # to the distorted camera coordinates so that the 2D projection matches the pixels.
    pts_cam = _pts_cam.clone().double()
    z = pts_cam[:, :, 2]
    is_valid = z > 1e-4
    z_inv = 1 / z

    x1 = pts_cam[:, :, 0] * z_inv
    y1 = pts_cam[:, :, 1] * z_inv

    # precalculations
    x1_2 = x1 * x1
    y1_2 = y1 * y1
    x1_y1 = x1 * y1
    r2 = x1_2 + y1_2
    r4 = r2 * r2
    r6 = r4 * r2

    r_dist = (1 + dist_coeffs[0] * r2 + dist_coeffs[1] * r4 + dist_coeffs[4] * r6) / (
        1 + dist_coeffs[5] * r2 + dist_coeffs[6] * r4 + dist_coeffs[7] * r6
    )

    # full (rational + tangential) distortion
    x2 = x1 * r_dist + 2 * dist_coeffs[2] * x1_y1 + dist_coeffs[3] * (r2 + 2 * x1_2)
    y2 = y1 * r_dist + 2 * dist_coeffs[3] * x1_y1 + dist_coeffs[2] * (r2 + 2 * y1_2)
    # denormalize for projection (which is a linear operation)
    cam_pts_dist = torch.stack([x2 * z, y2 * z, z], dim=2).float()
    return cam_pts_dist, is_valid

# path
arctic_root_path = '/data/ARCTIC/arctic/unpack/arctic_data/data' # there are 'images', 'meta', 'raw_seqs', and 'splits_json' folders in this directory
smplx_root_path = '/home/mks0601/workspace/human_model_files' # there is a 'smplx' folder in this directory
subject_name = 's01'
seq_name = 'box_grab_01'
cam_name = '0'
frame_idx = 70

# load files
with open(osp.join(arctic_root_path, 'meta', 'misc.json')) as f:
    db_info = json.load(f)
ego_cam_param = np.load(osp.join(arctic_root_path, 'raw_seqs', subject_name, seq_name + '.egocam.dist.npy'), allow_pickle=True)[()]
smplx_params = np.load(osp.join(arctic_root_path, 'raw_seqs', subject_name, seq_name + '.smplx.npy'), allow_pickle=True)[()]
img = cv2.imread(osp.join(arctic_root_path, 'images', subject_name, seq_name, cam_name, '%05d.jpg' % frame_idx))
v_template, _, _ = load_obj(osp.join(arctic_root_path, 'meta', 'subject_vtemplates', subject_name + '.obj'))
gender = db_info[subject_name]['gender']
smplx_layer = smplx.create(smplx_root_path, 'smplx', gender=gender, use_pca=False, flat_hand_mean=True, v_template=v_template)

# camera parameter
offset = db_info[subject_name]['ioi_offset']
frame_idx_offset = frame_idx - offset
cam_param = {'R': ego_cam_param['R_k_cam_np'][frame_idx_offset], \
            't': ego_cam_param['T_k_cam_np'][frame_idx_offset], \
            'focal': np.array([ego_cam_param['intrinsics'][0][0], ego_cam_param['intrinsics'][1][1]], dtype=np.float32), \
            'princpt': np.array([ego_cam_param['intrinsics'][0][2], ego_cam_param['intrinsics'][1][2]], dtype=np.float32), \
            'distortion': ego_cam_param['dist8']}

# get smplx vertices
smplx_param = {k: torch.FloatTensor(v[frame_idx_offset]).view(1,-1) for k,v in smplx_params.items()} # 'transl', 'global_orient', 'body_pose', 'jaw_pose', 'leye_pose', 'reye_pose', 'left_hand_pose', 'right_hand_pose'
output = smplx_layer(global_orient=smplx_param['global_orient'], body_pose=smplx_param['body_pose'], jaw_pose=smplx_param['jaw_pose'], leye_pose=smplx_param['leye_pose'], reye_pose=smplx_param['reye_pose'], left_hand_pose=smplx_param['left_hand_pose'], right_hand_pose=smplx_param['right_hand_pose'], transl=smplx_param['transl'])
xyz = output.vertices[0].detach().numpy() # world coordinate
xyz = np.dot(cam_param['R'], xyz.transpose(1,0)).transpose(1,0) + cam_param['t'].reshape(1,3) # camera coordinate
xyz, is_valid = distort_pts3d_all(torch.FloatTensor(xyz[None]), cam_param['distortion']) # distorted camera coordinate
xyz, is_valid = xyz[0].numpy(), is_valid[0].numpy()
x = xyz[:,0] / xyz[:,2] * cam_param['focal'][0] + cam_param['princpt'][0] # image coordinate
y = xyz[:,1] / xyz[:,2] * cam_param['focal'][1] + cam_param['princpt'][1] # image coordinate

# visualize
for i in range(len(x)):
    if is_valid[i]:
        img = cv2.circle(img, (int(x[i]), int(y[i])), 3, (255,0,0), -1)
cv2.imwrite(subject_name + '_' + seq_name + '_' + cam_name + '_' + str(frame_idx) + '.jpg', img)

[Image: s01_box_grab_01_0_70]

mks0601 commented 4 weeks ago

FYI, if I project the camera coordinates to image space without using the distortion function, there is no such problem.
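
For reference, the distortion-free projection mentioned here amounts to a plain pinhole model. Below is a minimal sketch, assuming camera-space points as an (N, 3) array and the `focal` / `princpt` layout from the script above. `project_pinhole` is a hypothetical helper name, not part of the ARCTIC code base.

```python
import numpy as np


def project_pinhole(xyz_cam, focal, princpt, min_depth=1e-4):
    """Project camera-space points (N, 3) to pixels without lens distortion.

    Returns the (N, 2) pixel coordinates and a boolean mask of the points
    that lie in front of the camera (depth > min_depth).
    """
    xyz_cam = np.asarray(xyz_cam, dtype=np.float64)
    z = xyz_cam[:, 2]
    is_valid = z > min_depth
    # Replace non-positive depths with 1 so the division is well defined;
    # those rows are flagged as invalid by the returned mask.
    safe_z = np.where(is_valid, z, 1.0)
    u = xyz_cam[:, 0] / safe_z * focal[0] + princpt[0]
    v = xyz_cam[:, 1] / safe_z * focal[1] + princpt[1]
    return np.stack([u, v], axis=1), is_valid
```

The returned pixels ignore lens distortion entirely, so the overlay is expected to drift near the image borders of a heavily distorted egocentric camera.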

zc-alexfan commented 3 weeks ago

Hi Gyeongsik,

The main reason is that hands and objects in the egocentric view are usually very close to the camera, so there is more distortion in pixel space. To address this, we use "vertex displacement" to "correct" points in 3D using the distortion parameters so that they give a better 2D overlay. However, this approach assumes that the points being distorted in 3D are not very close to the camera. This is not the case for SMPL-X (specifically, the head of the SMPL-X mesh).

You can skip the distort_pts3d_all function, but then distortion won't be taken into account. That said, vertex displacement is probably not the most suitable approach for the SMPL-X use case. I am also curious whether you know of any other solutions (maybe check out AssemblyHands, as they have more distorted images, or put the distortion parameters into the renderer to render properly).
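
One way to see which vertices the displacement may not handle is to look at the radial part of the distortion model on its own. In the 8-coefficient rational model used by `distort_pts3d_all`, the distorted radius is `r * r_dist(r)`. Outside the range the calibration was fitted on, that curve can stop increasing and fold points back toward the image. Vertices very close to the camera, such as the SMPL-X head, tend to have a large normalized radius `sqrt(x^2 + y^2) / z`, so they are the likely candidates. The sketch below is a diagnostic under stated assumptions, not the ARCTIC authors' method: `radial_factor`, `monotonic_radius_limit`, and `distortion_safe_mask` are hypothetical helpers, the coefficient order (k1, k2, p1, p2, k3, k4, k5, k6) is inferred from the indices used in `distort_pts3d_all`, and the tangential terms are ignored.

```python
import numpy as np


def radial_factor(r2, dist_coeffs):
    """Rational radial term, indexed the same way as in distort_pts3d_all."""
    k1, k2, _, _, k3, k4, k5, k6 = [float(c) for c in dist_coeffs[:8]]
    num = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    den = 1 + k4 * r2 + k5 * r2**2 + k6 * r2**3
    # A vanishing denominator yields inf/nan, which the scan below treats
    # as "no longer increasing".
    with np.errstate(divide="ignore", invalid="ignore"):
        return num / den


def monotonic_radius_limit(dist_coeffs, r_max=10.0, steps=10001):
    """Largest normalized radius up to which r * radial_factor(r^2) increases.

    Found by a numerical scan on [0, r_max]; returns r_max if the curve
    never turns around inside that range.
    """
    r = np.linspace(0.0, r_max, steps)
    r_distorted = r * radial_factor(r * r, dist_coeffs)
    not_increasing = np.flatnonzero(~(np.diff(r_distorted) > 0))
    return float(r[not_increasing[0]]) if not_increasing.size else float(r_max)


def distortion_safe_mask(xyz_cam, dist_coeffs, min_depth=1e-4):
    """Mask of camera-space points (N, 3) that are in front of the camera and
    inside the radius range where the radial model is still increasing."""
    xyz_cam = np.asarray(xyz_cam, dtype=np.float64)
    z = xyz_cam[:, 2]
    in_front = z > min_depth
    safe_z = np.where(in_front, z, 1.0)  # placeholder depth for masked rows
    r = np.hypot(xyz_cam[:, 0], xyz_cam[:, 1]) / safe_z
    return in_front & (r < monotonic_radius_limit(dist_coeffs))
```

In the script from the first comment, such a mask could be computed on the camera-space vertices before they are passed to `distort_pts3d_all` and then combined with `is_valid` when drawing. Whether this removes all misplaced vertices for a given sequence is untested here.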

mks0601 commented 3 weeks ago

Awesome. Thanks for checking!