lifeiteng / vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
Apache License 2.0
2.01k stars, 319 forks

Support for VALL-E X #52

Closed 152334H closed 1 year ago

152334H commented 1 year ago

https://vallex-demo.github.io/

Title

takan1 commented 1 year ago

@lifeiteng Thanks for the great work! Any plans to implement VALL-E X?

lifeiteng commented 1 year ago

Working on it now.

Li-JEN commented 1 year ago

Hi @lifeiteng, I have tried implementing VALL-E X based on your code. My only modifications were adding the language ID and using BigCiDian, which is mentioned in the paper, for the phoneme input.

I used LibriTTS and AISHELL3 as training data. The preliminary results are below.

Chinese speaker audio prompt: link

Text: 托管费在基金管理费之外征收 ("The custody fee is charged in addition to the fund management fee.")
Synthesized audio: link

Text: This I read with great attention while they sat silent.
Synthesized audio: link

I am not sure whether I did it correctly. Maybe I could discuss it with you? This is my email: mail
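For readers following along, the input-side change described above (a shared phoneme inventory plus a language ID) might look roughly like the sketch below. It is an illustration only, not code from this repository or from any fork mentioned in this thread. The token names `<en>` and `<zh>`, the helpers `build_vocab` and `encode`, and the toy phoneme sets are all hypothetical, and prepending a language token is just one simple way to expose the language ID to a model. The thread does not say how the language ID was actually wired in.

```python
# Hypothetical sketch: none of these names come from lifeiteng/vall-e or its forks.
LANG_TOKENS = {"en": "<en>", "zh": "<zh>"}  # assumed language-ID symbols


def build_vocab(phoneme_sets):
    """Map the padding symbol, language tokens, and all phonemes to integer IDs."""
    symbols = ["<pad>"] + sorted(LANG_TOKENS.values())
    seen = set(symbols)
    for phonemes in phoneme_sets:
        for p in sorted(phonemes):  # sorted for a deterministic ID assignment
            if p not in seen:
                seen.add(p)
                symbols.append(p)
    return {s: i for i, s in enumerate(symbols)}


def encode(phonemes, lang, vocab):
    """Prepend the language-ID token, then look up each phoneme."""
    return [vocab[LANG_TOKENS[lang]]] + [vocab[p] for p in phonemes]


# Toy phoneme inventories; a real setup would read them from a lexicon.
en_phones = {"DH", "IH1", "S"}
zh_phones = {"t", "uo1", "g", "uan3"}
vocab = build_vocab([en_phones, zh_phones])

en_ids = encode(["DH", "IH1", "S"], "en", vocab)
zh_ids = encode(["t", "uo1", "g", "uan3"], "zh", vocab)
```

Both sequences share one vocabulary, so a single text embedding table can serve both languages, and the first ID of each sequence tells the model which language follows.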

misakaikato commented 1 year ago

@Li-JEN Hi, the link to the first synthesized audio (the Chinese one) seems to be wrong: it still opens the prompt file. 😂

thomasbohm commented 1 year ago

@lifeiteng Hey thanks for the efforts! Any update on VALL-E X?

skysbird commented 1 year ago

@Li-JEN Same here. My solution is the same as yours, and it seems to work.

RahulBhalley commented 1 year ago

@skysbird How did you implement VALL-E X?

skysbird commented 1 year ago

Maybe you can refer to my e-x branch, forked from here.

RahulBhalley commented 1 year ago

@skysbird I don't see any code updates related to VALL-E X in your repository.

Li-JEN commented 1 year ago

@skysbird How do you handle the language ID part? I looked at your code, but it seems to follow the architecture of the original VALL-E.

skysbird commented 1 year ago

@Li-JEN I have embedded the language ID.

RahulBhalley commented 1 year ago

@skysbird Could you please drop a link to that line of code?

Never mind, found it.

treya-lin commented 11 months ago

So is there any update on implementing VALL-E X?