Evaluate the quantitative performance on the Dataset

UttaranB127 / speech2affective_gestures

This is the official implementation of the paper "Speech2AffectiveGestures: Synthesizing Co-Speech Gestures with Generative Adversarial Affective Expression Learning".

https://gamma.umd.edu/s2ag/

MIT License


Evaluate the quantitative performance on the Dataset #18

Open Shedima opened 2 years ago

Shedima commented 2 years ago

How would you evaluate the quantitative performance of your model on the genea_challenge_2020 dataset? I only found the code for evaluation on the TED dataset.

UttaranB127 commented 2 years ago

The method generate_gestures_by_dataset in processor_v2.py generates gestures for the GENEA dataset in addition to TED. The quantitative metrics for GENEA were computed manually, following the code inside generate_gestures in processor_v2.py.

Shedima commented 2 years ago

Do you mean that I should use generate_gestures_by_dataset to generate a sequence of poses, and then compute the quantitative metrics manually by following the code in generate_gestures?

UttaranB127 commented 2 years ago

Yes

Shedima commented 2 years ago

Hello, I followed the method described above to evaluate on the GENEA dataset and found that the FGD values are very high. Using the visualization method in the source code, I rendered the corresponding video and found that the generated poses are completely different from the real poses. I suspect this is the reason for the high FGD. Could you please provide a complete methodology for evaluating on the GENEA dataset?

[Screenshot, 2022-08-04 20-09-22]
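For readers unfamiliar with the metric, FGD (Fréchet Gesture Distance) is usually computed as the Fréchet distance between Gaussian fits of feature vectors extracted from real and generated gestures. The sketch below shows only that distance, assuming the features have already been extracted as arrays of shape (num_samples, feature_dim). It is not the repository's evaluation code, and the function name frechet_distance is hypothetical.

```python
import numpy as np


def frechet_distance(feat_real, feat_gen):
    """Frechet distance between Gaussian fits of two feature sets.

    Both inputs are assumed to have shape (num_samples, feature_dim).
    """
    feat_real = np.asarray(feat_real, dtype=float)
    feat_gen = np.asarray(feat_gen, dtype=float)

    # Mean and covariance of each feature set (rows are samples).
    mu_r, mu_g = feat_real.mean(axis=0), feat_gen.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(feat_real, rowvar=False))
    cov_g = np.atleast_2d(np.cov(feat_gen, rowvar=False))

    # Trace of the matrix square root of cov_r @ cov_g, obtained from
    # the eigenvalues of the product. They are real and non-negative in
    # exact arithmetic, so small numerical noise is clipped away.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt)
```

Because the squared difference of the means enters the distance directly, a systematic coordinate mismatch between the real and generated poses may inflate the value even when the generated motion is otherwise plausible.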

UttaranB127 commented 2 years ago

One thing I notice is that the arms in the GT seem to be vertically inverted (along the y-axis). I think the evaluation applies a vertical flip to both the GT and the predicted poses, but the flip might not be required for the GT. Could you try to evaluate and visualize after inverting the y-axis values of the GT back? Apart from that, we used the same error terms to evaluate on GENEA as on the TED dataset.
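A minimal sketch of the suggested check, assuming the ground-truth poses are stored as an array whose last axis holds (x, y, z) coordinates with y at index 1. The actual layout in processor_v2.py may differ, and flip_y is a hypothetical helper, not a function from the repository.

```python
import numpy as np


def flip_y(poses, y_index=1):
    """Return a copy of `poses` with the y coordinate negated.

    `poses` is assumed to have shape (..., 3), e.g. (frames, joints, 3).
    """
    flipped = np.array(poses, dtype=float, copy=True)
    flipped[..., y_index] *= -1.0  # undo (or apply) a vertical inversion
    return flipped


# Example: 2 frames, 3 joints, (x, y, z) per joint.
gt_poses = np.arange(18, dtype=float).reshape(2, 3, 3)
gt_unflipped = flip_y(gt_poses)
```

The flipped ground truth can then be passed to the same evaluation and visualization steps in place of the original array, so that only one variable changes between the two runs.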

Shedima commented 2 years ago

I also found that the arms in the GT are inverted vertically. I tried reversing the y-axis values, but the FGD is still very high. Even though I am following the evaluation and visualization in the source code you provided, I cannot reproduce the GENEA results reported in your paper. Could you provide the source code you used to evaluate on the GENEA dataset?

UttaranB127 commented 2 years ago

My apologies, but I currently don't have access to that evaluation code, and I'm not sure how soon I will be able to retrieve it. Meanwhile, you can report the higher numbers if you followed the same evaluation methodology as for the TED dataset.