junhwanjang / visemenet-inference

3D Avatar Lip Synchronization from speech (JALI based face-rigging)
Apache License 2.0
72 stars, 15 forks

Output File Format Documentation? #1

Open jasondalycan opened 1 year ago

jasondalycan commented 1 year ago

Hi, is there documentation on how to read the output file format, i.e. what each column means?

fadiaburaid commented 1 year ago

Hello, did you manage to find any documentation on what each column represents?

fadiaburaid commented 1 year ago

I found in the original VisemeNet_tensorflow repo that each column represents a Maya viseme for the JALI face rig, arranged as follows: ['Jaw', 'Lip', 'Ah', 'Aa', 'Eh', 'Ee', 'Ih', 'Oh', 'Uh', 'U', 'Eu', 'Schwa', 'R', 'S', 'Sh Ch Zh', 'Th', 'JY', 'LNTD', 'GK', 'MBP', 'FV', 'WA_PEDAL']
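For anyone who wants to attach these names to the numbers, here is a minimal sketch. It assumes the output is plain text with one frame per line and 22 delimiter-separated numeric values per frame; that layout is an assumption, not something confirmed in this thread, and `label_frames` is a hypothetical helper, not part of the repo.

```python
# Column order as reported in this thread (from the VisemeNet_tensorflow repo).
VISEME_COLUMNS = [
    'Jaw', 'Lip', 'Ah', 'Aa', 'Eh', 'Ee', 'Ih', 'Oh', 'Uh', 'U', 'Eu',
    'Schwa', 'R', 'S', 'Sh Ch Zh', 'Th', 'JY', 'LNTD', 'GK', 'MBP', 'FV',
    'WA_PEDAL',
]


def label_frames(text, delimiter=None):
    """Turn raw output text into a list of {column name: value} dicts.

    Assumption: one frame per line, 22 numeric values per line.
    delimiter=None splits on any whitespace; pass "," for comma-separated text.
    """
    frames = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        values = [float(v) for v in line.split(delimiter)]
        if len(values) != len(VISEME_COLUMNS):
            raise ValueError(
                f"line {line_no}: expected {len(VISEME_COLUMNS)} values, "
                f"got {len(values)}"
            )
        frames.append(dict(zip(VISEME_COLUMNS, values)))
    return frames
```

If the real file uses a different delimiter or includes a header row, the parsing step would need adjusting; the column order is the only part taken from the thread.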

Also, according to the paper: "The features are extracted every 10 ms, or in other words feature extraction is performed at a 100 FPS rate."
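As a rough illustration of what the 100 FPS figure implies, the sketch below converts a frame index to a timestamp and picks the nearest source frames for a lower playback rate. It assumes the output rows are spaced at the same 10 ms interval as the feature extraction, which the paper quote suggests but this thread does not confirm; both function names are hypothetical.

```python
def frame_to_seconds(frame_index, fps=100):
    """Timestamp in seconds of a frame, assuming a constant frame rate."""
    return frame_index / fps


def resample_indices(num_frames, src_fps=100, dst_fps=25):
    """Indices of the source frames nearest to each target-rate frame.

    Uses integer arithmetic for the output length to avoid float rounding.
    """
    num_out = num_frames * dst_fps // src_fps
    return [
        min(num_frames - 1, round(i * src_fps / dst_fps))
        for i in range(num_out)
    ]
```

For example, `frame_to_seconds(250)` gives 2.5 seconds, and `resample_indices(100)` selects every fourth frame to go from 100 FPS to 25 FPS. Nearest-frame picking is the simplest option; interpolating between frames may give smoother animation.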

onehundredfeet commented 1 year ago

The frame rate can be changed in inference.py.

fadiaburaid commented 1 year ago

> The frame rate can be changed in the inference.py

Great, I can see how it can be changed now. Thank you.