Sound data from Acapela?

TechnoX commented 3 years ago

Are the data generated from the Acapela TTS available? I would like to get access to the visemes for animating the lips on the robot.

As I understand it, the Acapela TTS engine generates both phonemes and visemes, and they are available in a struct. See, for example, the section about Acapela NSCAPI here (https://wolfpaulus.com/lipsynchronization/), in particular the NSC_EVENT_DATA_PhoSynch struct.

Even time-stamps from when each letter / word is pronounced would be very useful.

In case the data is not available, can I access the raw sound data generated by Acapela and do my own phoneme / viseme recognition from that data?

apaikan commented 3 years ago

That's an interesting feature request. For the time being, no: the only output you can get from Acapela is the generated voice file. However, we can take a look into that and see how the callbacks/phonetics can be exposed via ROS.

TechnoX commented 3 years ago

Thanks! Keep me informed! The data should be available somehow so there is only a wrapper that needs to be written. But I know that it still takes time to do :)

TechnoX commented 3 years ago

You wrote that the generated voice file can be found. Is it possible to get access to it? How?

apaikan commented 3 years ago

Hi @TechnoX, right, you can find the generated .wav file in the /tmp folder. The interface overwrites the .wav file every time you call the speech service/topic.
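
Until phoneme/viseme events are exposed, one rough workaround is to read that .wav file and drive the mouth opening from its loudness. The following is only a minimal sketch using the Python standard library: it assumes the file is 16-bit PCM, and `rms_envelope` is a hypothetical helper, not part of the QTrobot interface. The exact file name under /tmp is not given in this thread, so you would need to look it up on the robot. Note that this yields a loudness envelope over time, not real visemes.

```python
import array
import math
import sys
import wave


def rms_envelope(wav_path, frame_ms=20):
    """Return a list of (start_time_seconds, rms) pairs, one per frame.

    Assumption: the file is 16-bit PCM. For multi-channel audio only the
    first channel is used.
    """
    with wave.open(wav_path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    mono = samples[::channels]  # keep the first channel only
    frame_len = max(1, rate * frame_ms // 1000)

    envelope = []
    for start in range(0, len(mono), frame_len):
        frame = mono[start:start + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        envelope.append((start / rate, rms))
    return envelope
```

Because the interface overwrites the file on every speech call, it is probably safest to copy it somewhere else (for example with `shutil.copy`) right after the call and analyse the copy. Each returned pair gives a frame start time in seconds and an RMS level that can be scaled to a jaw or lip opening.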
