dkadish / pynktrombone

Python port of the Pink Trombone JS code
synthesizing voiceless consonants by replacing the glottal flow #2

Open Yashish92 opened 2 years ago

Yashish92 commented 2 years ago

I came across your repository while looking for a Python implementation of a vocal tract model, and I have found it really useful. I discovered Pink Trombone a while back and have always wanted to use it in one of my machine learning models. I started playing around with your repo and generated some vowel sounds, but I was not able to synthesize any consonant sounds that lack voicing (glottal activity), so I was looking for a way to replace the glottal flow with a noise flow. Is that already implemented somewhere in the library? If so, I would really appreciate your help in pointing it out.

On a side note, I would also like to hear your thoughts on this repository: would it be a long shot to use it to synthesize continuous speech, or at least some meaningful words? Thanks!

Yashish92 commented 2 years ago

I just figured out that setting the 'tenseness' parameter to zero results in voiceless consonant sounds, which makes sense. Let me know if that's the best way to go about this or if you have any better suggestions. Thanks!
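
A toy sketch of why a tenseness of zero can sound voiceless, assuming the glottal source is a blend of a voiced waveform and aspiration noise whose balance is set by tenseness. The function names, the sine stand-in for the voiced pulse, and the exact weights below are assumptions for illustration only; they are not pynktrombone's API, and the library's actual formula may differ.

```python
import math
import random


def glottal_sample(phase, tenseness, noise):
    """Toy glottal source: blend a voiced tone with aspiration noise.

    phase     -- position in the current glottal cycle, in [0, 1)
    tenseness -- 0.0 (fully breathy) to 1.0 (fully tense)
    noise     -- one white-noise sample in [-1, 1]
    """
    # Hypothetical stand-in for the voiced waveform; a real model
    # would use a glottal pulse shape rather than a sine.
    voiced = (tenseness ** 0.25) * math.sin(2.0 * math.pi * phase)
    # Assumed weighting: aspiration grows as tenseness falls.
    aspiration = (1.0 - math.sqrt(tenseness)) * noise
    return voiced + aspiration


def render(tenseness, n_samples=8, frequency=140.0, sample_rate=44100, seed=0):
    """Render a few samples of the toy source at a fixed pitch."""
    rng = random.Random(seed)
    out = []
    for n in range(n_samples):
        phase = (n * frequency / sample_rate) % 1.0
        out.append(glottal_sample(phase, tenseness, rng.uniform(-1.0, 1.0)))
    return out
```

Under these assumptions, `render(0.0)` returns pure noise (the voiced term is scaled to zero), while `render(1.0)` returns only the periodic tone. Feeding a noise-only source into the tract is one plausible reading of "replacing the glottal flow with a noise flow".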

dkadish commented 2 years ago

I think you should be able to generate and play continuous speech as long as your machine is powerful enough (I don't think this works on a Raspberry Pi, for example, but any modern laptop should be able to handle it if I remember correctly).

I'm not sure about replacing the flows. I built this a while back and it was a pretty naive attempt to reimplement the work that was done in both PinkTrombone (https://github.com/jamesstaub/pink-trombone-osc) and the C++ port called VOC (https://github.com/PaulBatchelor/voc). But it sounds like you've figured something out for creating consonant sounds.

I've had some success evolving neural networks to generate (non-speech) sound using this code, but it should be possible to control it to produce something that sounds speech-like.

Let me know if you have any other specific questions. I'm curious to see what you end up making!