Nicholasli1995 / EvoSkeleton

Official project website for the CVPR 2020 paper (Oral Presentation) "Cascaded deep monocular 3D human pose estimation with evolutionary training data"
https://arxiv.org/abs/2006.07778
MIT License

Data preprocessing when testing on the 3DHP dataset #75

Closed: ZhangRenkai closed this issue 2 years ago

ZhangRenkai commented 2 years ago

After training the model on the H36M dataset, how should we preprocess the 3DHP dataset for evaluation? Since the two datasets define the joints differently (3DHP has no nose joint), should we delete the nose joint from the H36M data during training (input size: 16 x 2, output size: 15 x 3)? The 3DHP dataset also provides two kinds of 3D joint coordinates; which one should we use during evaluation? Should we delete the nose joint (16 x 3), then align the keypoints to the root joint and delete the root joint (15 x 3)? Should we first rescale the estimated 3DHP 2D coordinates and then normalize the 2D keypoints using the H36M statistics? And what about the 3DHP 3D keypoints? Are there any other details that need attention?

Nicholasli1995 commented 2 years ago


The 'gt_3d_univ' version of the 3DHP data was used as ground truth. The official MPI-INF-3DHP annotation order is:

0: head, 1: thorax, 2: right_shoulder, 3: right_elbow, 4: right_wrist, 5: left_shoulder, 6: left_elbow, 7: left_wrist, 8: right_hip, 9: right_knee, 10: right_ankle, 11: left_hip, 12: left_knee, 13: left_ankle, 14: hip, 15: spine, 16: neck

The map from these to the H36M joints is [14, 8, 9, 10, 11, 12, 13, 15, 1, 16, 0, 5, 6, 7, 2, 3, 4]. In the evaluation script, the joints are mapped, aligned to the root, and compared with the predicted 3D joints.
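For concreteness, here is a minimal sketch of that remap, align, and compare step. The array names and shapes are assumptions for illustration, not the repository's actual evaluation code:

import numpy as np

# Map from the official MPI-INF-3DHP joint order to the H36M joint order
# (the list given above).
MPI_TO_H36M = [14, 8, 9, 10, 11, 12, 13, 15, 1, 16, 0, 5, 6, 7, 2, 3, 4]

def align_to_root(joints_3d, root_idx=0):
    # joints_3d: (N, 17, 3); after remapping, the hip (root) sits at index 0
    return joints_3d - joints_3d[:, root_idx:root_idx + 1, :]

def mpjpe_mm(pred, gt):
    # mean per-joint position error, in the units of the inputs (mm for 3DHP)
    return np.linalg.norm(pred - gt, axis=-1).mean()

# hypothetical usage, with gt_3dhp of shape (N, 17, 3) in the official order
# and pred of shape (N, 17, 3) in the H36M order:
# gt_h36m = gt_3dhp[:, MPI_TO_H36M, :]
# error = mpjpe_mm(align_to_root(pred), align_to_root(gt_h36m))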

Nicholasli1995 commented 2 years ago

If you need it, I may push an evaluation script a bit later when I have time.


ZhangRenkai commented 2 years ago

Hello, I just tried to write evaluation code for the 3DHP dataset, and the results are as follows.

[screenshot of evaluation results]

The model is trained with this command line:

python 2Dto3Dnet.py -train True -num_stages 3 -num_blocks 3 -twoD_source "synthetic" -evolved_path "generation_6.npy"

But the MPJPE error is far from the paper's result, and I don't know what went wrong. The data processing code is below; there is no other preprocessing before feeding the data into the model. I also don't know whether the 3DHP data needs to be transformed to the camera coordinate system, as is done for the H36M dataset.

import os

import h5py
import numpy as np

def prepare_dataset(opt):
    # H36M training data is prepared as in the original script;
    # rcams (camera parameters) comes from the unchanged H36M pipeline
    data_dic, data_stats = prepare_data_dict(rcams,
                                             opt,
                                             predict_14=False
                                             )
    # load and normalize the 3DHP annotations for evaluation
    annot2, annot3 = get_mpi_data()
    annot2, annot3 = normalize_3dhp(annot2, annot3)

    eval_dataset = dataset.PoseDataset(annot2,
                                       annot3,
                                       'eval',
                                       refine_3d=opt.refine_3d
                                       )
    action_eval_list = eval_dataset

    # train_dataset is built from data_dic exactly as in the original script (omitted here)
    return train_dataset, eval_dataset, data_stats, action_eval_list

def get_3dhp_normalize_stats():
    # load the pre-computed 2D/3D normalization statistics
    stats_root = './result/mpi_2d_annot'
    mean_2d_path = os.path.join(stats_root, 'data_mean_2d.npy')
    data_mean_2d = np.load(mean_2d_path, allow_pickle=True)
    std_2d_path = os.path.join(stats_root, 'data_std_2d.npy')
    data_std_2d = np.load(std_2d_path, allow_pickle=True)
    mean_3d_path = os.path.join(stats_root, 'data_mean_3d.npy')
    data_mean_3d = np.load(mean_3d_path, allow_pickle=True)
    std_3d_path = os.path.join(stats_root, 'data_std_3d.npy')
    data_std_3d = np.load(std_3d_path, allow_pickle=True)
    dim_2d_path = os.path.join(stats_root, 'dim_to_use_2d.npy')
    dim_to_use_2d = np.load(dim_2d_path, allow_pickle=True)
    dim_3d_path = os.path.join(stats_root, 'dim_to_use_3d.npy')
    dim_to_use_3d = np.load(dim_3d_path, allow_pickle=True)

    return data_mean_2d, data_std_2d, data_mean_3d, data_std_3d, dim_to_use_2d, dim_to_use_3d

def normalize_3dhp_annot(annot, mean, std, dim_use):
    B, J, d = annot.shape

    # 3D annotations are root-aligned first; the root joint is then dropped
    if d == 3:
        annot = remove_root(annot)

    annot = annot.reshape(B, -1)
    mean = mean[dim_use]
    std = std[dim_use]

    # standardize with the pre-computed statistics
    annot = np.divide((annot - mean), std)

    return annot

def remove_root(annot):
    # subtract the root (hip) joint and discard it; broadcasting replaces
    # the explicit np.repeat of the original version
    root = annot[:, 0, :]
    root = np.expand_dims(root, axis=1)
    annot = annot - root
    annot = annot[:, 1:]

    return annot

def normalize_3dhp(annot2, annot3):
    data_mean_2d, data_std_2d, data_mean_3d, data_std_3d, dim_to_use_2d, dim_to_use_3d = get_3dhp_normalize_stats()
    annot2 = normalize_3dhp_annot(annot2, data_mean_2d, data_std_2d, dim_to_use_2d)
    annot3 = normalize_3dhp_annot(annot3, data_mean_3d, data_std_3d, dim_to_use_3d)

    return annot2, annot3

def get_mpi_data():
    # map from the official 3DHP joint order to the H36M joint order
    convert_map = [14, 8, 9, 10, 11, 12, 13, 15, 1, 16, 0, 5, 6, 7, 2, 3, 4]

    data_root = './data/mpi_3dhp_data'

    annot2_data = []
    annot3_data = []

    # gather the .mat annotation files of the 3DHP test set
    data_names = os.listdir(data_root)
    mat_data_names = [name for name in data_names if name.endswith('.mat')]

    for name in mat_data_names:
        data_path = os.path.join(data_root, name)
        data = h5py.File(data_path, 'r')

        annot2 = data['annot2'][:]
        annot3 = data['univ_annot3'][:]  # the 'universal' 3D annotation
        valid_frame = data['valid_frame'][:, 0]
        bb_crop = data['bb_crop'][:]  # loaded but unused

        # re-order the joints to match H36M
        annot2 = annot2[:, 0, convert_map]
        annot3 = annot3[:, 0, convert_map]

        # keep only the valid frames
        annot2 = annot2[valid_frame == 1]
        annot3 = annot3[valid_frame == 1]

        # (optional: scatter-plot annot2[0] here to sanity-check the joint order)

        annot2_data.append(annot2)
        annot3_data.append(annot3)

    # concatenate directly; wrapping the lists in np.array first is redundant
    annot2_data = np.concatenate(annot2_data, axis=0)
    annot3_data = np.concatenate(annot3_data, axis=0)

    return annot2_data, annot3_data
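For reference, here is a minimal sketch of the evaluation step still missing above: un-normalize the network output with the same 3D statistics used in normalize_3dhp_annot, reshape back to joints, and compare in millimeters. The names data_mean_3d, data_std_3d, and dim_to_use_3d are carried over from the snippets above; the 16-joint reshape assumes dim_to_use_3d keeps all non-root joints. This is an assumed sketch, not the repository's evaluation script:

def unnormalize_3d(pred_norm, mean, std, dim_use):
    # invert the standardization from normalize_3dhp_annot;
    # pred_norm: (B, len(dim_use)) in normalized space
    return pred_norm * std[dim_use] + mean[dim_use]

def eval_mpjpe_mm(pred_norm, gt_norm, mean, std, dim_use):
    # both inputs are root-removed and normalized, as produced above;
    # assumes len(dim_use) == 16 * 3 so the reshape recovers the joints
    pred_mm = unnormalize_3d(pred_norm, mean, std, dim_use).reshape(-1, 16, 3)
    gt_mm = unnormalize_3d(gt_norm, mean, std, dim_use).reshape(-1, 16, 3)
    return np.linalg.norm(pred_mm - gt_mm, axis=-1).mean()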