landmark inaccuracy on the RAVDESS dataset

wuhaozhe commented 4 years ago


Hi, this project is really useful for several downstream tasks. I'm currently using 3DDFA_V2 to reconstruct talking faces from the RAVDESS dataset. It is a very clean in-lab dataset with high-resolution heads against a white background. However, the reconstruction accuracy does not seem good: several landmarks on the lips are misaligned. Here is an example: the original image and the reconstructed image. The lips are clearly closed in the original image but open in the reconstructed one. Should I adjust any parameters when running 3D reconstruction on videos?

cleardusk commented 3 years ago

I ran the image with mobilenetv1 and resnet22, and the results seem better than yours. Since the input is resized to 120x120, the model may not perform well on high-resolution images, e.g. 4K resolution. If you want more accurate results, you may need better training data (diverse, high-resolution images) and a better 3DMM-like model (with better expression modeling). Better data and a better base model give better results.
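To put rough numbers on the 120x120 resize point, here is a minimal sketch of how a landmark error at the model's input resolution scales back to the source image. It only assumes a square face crop resized to 120x120, as stated above; the function name and the crop sizes are hypothetical and not part of 3DDFA_V2.

```python
# Illustrative arithmetic only: this does not call 3DDFA_V2.
# Assumption: a square face crop is resized to model_input_px on each side.

def landmark_error_in_source_pixels(
    crop_side_px: float,
    error_at_model_res_px: float,
    model_input_px: int = 120,
) -> float:
    """Scale a landmark error measured at the model input back to the source crop."""
    scale = crop_side_px / model_input_px  # source pixels per model-input pixel
    return error_at_model_res_px * scale


if __name__ == "__main__":
    # Hypothetical face-crop side lengths, from a small crop up to a 4K-height crop.
    for crop_side in (120, 480, 960, 2160):
        err = landmark_error_in_source_pixels(crop_side, error_at_model_res_px=1.0)
        print(f"{crop_side:>5} px crop: 1 px at model input is about {err:.1f} px in the source")
```

Under these assumptions, the larger the face crop, the more a sub-pixel error at 120x120 is magnified in the original frame, which may be enough to turn a small lip gap into a visibly open mouth.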

Result with the ResNet backbone (image: hardcase_3d_resnet).

BTW, how to make use of unlabeled high-resolution data in an unsupervised way is also an interesting research problem.

wuhaozhe commented 3 years ago

Thanks!!

cleardusk / 3DDFA_V2

landmark inaccuracy on the RAVDESS dataset #30