Closed. duj12 closed this issue 1 year ago
I noticed that in the tlm sample you used data/prompt.txt. Is this file a sequence of discrete laughter units?
Yes. If the prompt exists, tlm will generate a continuation of the prompt.
Regarding your next question, you can find a sample of the specific laughter you want to synthesize and use it as a prompt to generate it. You can of course also train the model on phonemes, as long as you have enough labeled data.
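In case it helps, here is a rough sketch of how a laughter sample could be turned into a unit prompt. It is not the repository's actual preprocessing: it assumes you already have per-frame feature vectors for the audio and a set of k-means centroids, and the names `nearest_unit`, `quantize`, and `format_prompt`, the toy data, and the space-separated prompt format are all hypothetical.

```python
import math


def nearest_unit(frame, centroids):
    """Return the index of the centroid closest to `frame` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda k: math.dist(frame, centroids[k]))


def quantize(frames, centroids, dedupe=False):
    """Map each feature frame to its nearest k-means centroid index (a discrete unit).

    With dedupe=True, runs of identical consecutive units are collapsed.
    Whether tlm expects collapsed units is an assumption to check.
    """
    units = [nearest_unit(frame, centroids) for frame in frames]
    if dedupe:
        units = [u for i, u in enumerate(units) if i == 0 or u != units[i - 1]]
    return units


def format_prompt(units):
    """Join units into one space-separated line (assumed prompt format)."""
    return " ".join(str(u) for u in units)


# Toy stand-ins: real centroids would come from a trained k-means model,
# and real frames from a speech feature extractor.
centroids = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
frames = [(0.1, -0.1), (0.2, 0.0), (0.9, 1.2), (4.8, 5.1), (5.2, 4.9), (1.1, 0.8)]

units = quantize(frames, centroids)
prompt_line = format_prompt(units)
print(prompt_line)
```

The resulting line could then be saved as the prompt file, provided its format matches what the sample script reads. Choosing a prompt clip that already sounds like the target laughter (for example a "haahaahaa" recording) is what steers the continuation toward that style.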
I am wondering how we can synthesize a specific laughter, such as haahaahaa, hiihiihii, hnnhnnhnn, or heeheehee. If we use phonemes, we can map the laughter to specific phonemes and then train a dedicated TTS model to synthesize it. With your tlm, it seems we can only take a laughter audio clip, convert it to a discrete unit sequence by k-means clustering, sample a new sequence with tlm that is similar to the original audio, and then feed this sequence to the TTS model to synthesize the laughter.
I am not sure I understand the usage of tlm correctly. Is there another way to synthesize the specific laughter we want?