Doubiiu / CodeTalker

[CVPR 2023] CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior
MIT License

Code is not present for lip-distance calculations #55

Closed ketyi closed 1 year ago

ketyi commented 1 year ago

Hi,

Great work! Could you elaborate on lip-distance calculation details please?


Doubiiu commented 1 year ago

Hi, this is easy to compute. First, find the indices of an upper-lip vertex and a lower-lip vertex (e.g., using MeshLab). Then calculate the L2 distance between these two vertices in every frame and plot the resulting curve with your own visualization tool or library. Here is a code snippet that calculates the lip distance on VOCASET (modify the paths and file names to match your own setup):

import os
import numpy as np

# Replace every "<...>" placeholder with your own path or file name.
sentence = "<sentence name>"  # e.g. 'FaceTalk_170809_00138_TA_sentence21.npy'
pred_name = "<your own file name when generating and saving this sentence of your model>"
ours = "<path to saved .npy file>"
ablation_IN = "<path to saved .npy file>"
gt = "<path to .npy of the dataset>"
output_dir = "<output path>"
vertice_dim = 15069  # VOCASET: 5023 vertices * 3 coordinates
upper_pos = 3546  # VOCASET upper-lip vertex index
lower_pos = 3504  # VOCASET lower-lip vertex index; the original 2151 is not correct
num_frames = 108  # number of frames compared in each sequence

distance = []
for idx, npys_dir in enumerate([ours, ablation_IN, gt]):
    if idx == 2:  # GT data is stored in a different format
        vertices = np.load(os.path.join(npys_dir, sentence))
        # Keep every second GT frame after reshaping to [N, V, 3]
        vertices = np.reshape(vertices, (-1, vertice_dim // 3, 3))[::2, ...]
    else:
        vertices = np.load(os.path.join(npys_dir, pred_name))
        vertices = np.reshape(vertices, (-1, vertice_dim // 3, 3))  # [N, V, 3]

    pos_upper = vertices[:num_frames, upper_pos, :]  # [N, 3]
    pos_lower = vertices[:num_frames, lower_pos, :]  # [N, 3]
    # Per-frame L2 distance between the two lip vertices, shape [N]
    distance.append(np.linalg.norm(pos_upper - pos_lower, axis=1))

distance = np.array(distance)  # [3, N]; rows are ours, ablation_IN, gt

# Write the CSV into output_dir (the original call ignored output_dir)
np.savetxt(os.path.join(output_dir, 'ablation_IN_lip_distance.csv'), distance, delimiter=",")

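
For the plotting step mentioned above, a minimal sketch might look like the following. It is not part of the CodeTalker repository: the use of matplotlib, the function name `plot_lip_distance`, the output file name `lip_distance.png`, and the synthetic curves that stand in for real results are all assumptions made for illustration. The only thing taken from the snippet is the CSV layout, with one row per method and one column per frame.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np


def plot_lip_distance(csv_path, labels, out_png):
    """Plot one lip-distance curve per CSV row and save the figure.

    Hypothetical helper: expects a CSV of shape [num_methods, N].
    """
    curves = np.loadtxt(csv_path, delimiter=",", ndmin=2)  # [num_methods, N]
    if curves.shape[0] != len(labels):
        raise ValueError("expected one label per CSV row")
    frames = np.arange(curves.shape[1])
    fig, ax = plt.subplots(figsize=(8, 3))
    for curve, label in zip(curves, labels):
        ax.plot(frames, curve, label=label)
    ax.set_xlabel("frame")
    ax.set_ylabel("lip distance")
    ax.legend()
    fig.tight_layout()
    fig.savefig(out_png, dpi=150)
    plt.close(fig)
    return curves


# Synthetic stand-in data (NOT real results): three curves of 108 frames each.
frame_idx = np.arange(108)
demo = np.stack(
    [0.01 + 0.005 * np.abs(np.sin(frame_idx / 8.0 + shift)) for shift in (0.0, 0.1, 0.2)]
)

tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "ablation_IN_lip_distance.csv")
np.savetxt(csv_path, demo, delimiter=",")

png_path = os.path.join(tmp_dir, "lip_distance.png")
curves = plot_lip_distance(csv_path, ["ours", "ablation_IN", "gt"], png_path)
```

With real data, `csv_path` would point at the CSV written by the snippet, and the labels would follow the order in which the directories were listed in the loop.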
ketyi commented 1 year ago

Thank you, @Doubiiu

bullgokman commented 8 months ago

Why use 2151 for 'lower_pos'? When I check VOCASET in Blender, vertex 2151 is located on the back of the neck. Is there any other reason for using 2151 for the lower lip?

Doubiiu commented 8 months ago

Hi @bullgokman, yep, you are correct: 3504 should be the right index. Although using 2151 can also reflect the lip movement to some extent, 3504 should be more suitable. Thanks for the reminder.
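
One way to sanity-check a candidate vertex index without opening a mesh viewer is to measure how much that vertex moves over a sequence. The sketch below is only an illustration under stated assumptions: the helper names `vertex_motion` and `lip_distance` are hypothetical, and the three-vertex "mesh" is synthetic, so its indices 0, 1 and 2 are not VOCASET indices. It also shows one possible reading of the remark above: a distance measured against a static vertex still varies, because the upper-lip vertex itself moves, but it varies much less than a true lip-to-lip distance.

```python
import numpy as np


def vertex_motion(vertices, index):
    """Mean displacement of one vertex from its first-frame position.

    vertices: array of shape [N, V, 3].
    """
    track = vertices[:, index, :]  # [N, 3]
    return float(np.linalg.norm(track - track[0], axis=1).mean())


def lip_distance(vertices, upper_idx, lower_idx):
    """Per-frame L2 distance between two vertices, shape [N]."""
    return np.linalg.norm(vertices[:, upper_idx] - vertices[:, lower_idx], axis=1)


# Synthetic sequence: the mouth opens and closes along the y axis.
num_frames = 50
opening = 0.01 * np.abs(np.sin(np.linspace(0.0, 2.0 * np.pi, num_frames)))

vertices = np.zeros((num_frames, 3, 3))
vertices[:, 0, 1] = 0.005 + 0.3 * opening  # vertex 0: upper lip, moves slightly
vertices[:, 1, 1] = -0.005 - opening       # vertex 1: lower lip, follows the jaw
vertices[:, 2, :] = [0.0, -0.05, -0.1]     # vertex 2: static, e.g. back of the neck

motion_lower = vertex_motion(vertices, 1)
motion_static = vertex_motion(vertices, 2)

d_lip = lip_distance(vertices, 0, 1)     # upper lip to lower lip
d_static = lip_distance(vertices, 0, 2)  # upper lip to the static vertex

print("lower-lip motion:", motion_lower, "static-vertex motion:", motion_static)
print("std of lip distance:", d_lip.std(), "std vs static vertex:", d_static.std())
```

On real data, a lower-lip candidate whose `vertex_motion` is close to zero during speech is likely not on the lip, which can then be confirmed in Blender or MeshLab.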