karkirowle / articulatory_inversion

DBLSTM baseline for articulatory inversion

question #2

Closed qq957606954 closed 3 years ago

qq957606954 commented 3 years ago

Can I get your data? Thanks!

karkirowle commented 3 years ago

Hey! You need to request a login for the MNGU0 dataset because it is controlled access: http://mngu0.org/ Let me know if you struggle with the reproduction; I can look into it and make some changes.

qq957606954 commented 3 years ago

Thanks! I have recently been doing research on articulatory-to-acoustic conversion using EMA data. Do you have any ideas?

karkirowle commented 3 years ago

This repository focuses on acoustic-to-articulatory inversion, but there are some papers going the other way; most recently I've seen an attempt to use WaveNet as the model (https://arxiv.org/abs/2006.12594). In general, I think attention-based architectures would be the most promising direction to try, but the sequences are too long, so either a fast attention variant or some clever downsampling scheme has to be used. Also, I doubt that EMA actually carries enough information for speech synthesis; rtMRI-based techniques give a better view of the vocal tract, but their sampling rate is too low (<200 Hz). So the holy grail is yet to come for this field, I feel.
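To make the "downsample before attention" idea concrete, here is a minimal sketch (not part of this repository) of an EMA-to-mel model that shortens the input sequence with a strided convolution before self-attention and upsamples back to the frame rate afterwards. It assumes PyTorch; the class name, channel counts, and hyperparameters are all hypothetical placeholders, not a tested recipe.

```python
# Hypothetical sketch: attention over a downsampled EMA sequence for
# articulatory-to-acoustic prediction. Dimensions and names are illustrative only.
import torch
import torch.nn as nn


class EMAToAcousticSketch(nn.Module):
    def __init__(self, ema_dim=12, mel_dim=80, hidden=256, n_heads=4, n_layers=4, stride=4):
        super().__init__()
        # Strided 1-D convolution shortens the sequence by `stride` so that
        # the quadratic cost of self-attention stays manageable.
        self.downsample = nn.Conv1d(ema_dim, hidden, kernel_size=stride * 2,
                                    stride=stride, padding=stride // 2)
        encoder_layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=n_heads,
                                                   batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        # Transposed convolution brings the sequence back to (roughly) the
        # original frame rate before predicting mel frames.
        self.upsample = nn.ConvTranspose1d(hidden, hidden, kernel_size=stride * 2,
                                           stride=stride, padding=stride // 2)
        self.out = nn.Linear(hidden, mel_dim)

    def forward(self, ema):  # ema: (batch, time, ema_dim)
        x = self.downsample(ema.transpose(1, 2))              # (batch, hidden, time/stride)
        x = self.encoder(x.transpose(1, 2))                   # (batch, time/stride, hidden)
        x = self.upsample(x.transpose(1, 2)).transpose(1, 2)  # (batch, ~time, hidden)
        return self.out(x)                                    # (batch, ~time, mel_dim)


if __name__ == "__main__":
    model = EMAToAcousticSketch()
    dummy_ema = torch.randn(2, 400, 12)   # 2 utterances, 400 frames, 12 EMA channels
    print(model(dummy_ema).shape)         # -> torch.Size([2, 400, 80])
```

A fast-attention variant (e.g. linear attention) could replace the vanilla Transformer encoder here; the downsampling/upsampling pair is just one way to keep sequence lengths tractable.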

qq957606954 commented 3 years ago

Well, thanks!