Closed 152334H closed 1 year ago
@lifeiteng Thanks for the great work! Any plan to implement vallex?
Working on it now.
Hi @lifeiteng, I have tried VALL-E X based on your code. My only modifications were adding the language ID and using BigCiDian, as mentioned in the paper, for the phoneme input.
I used LibriTTS and AISHELL3 as data, and the preliminary results are below.
Chinese speaker audio prompt: link
Text: 托管费在基金管理费之外征收
Synthesized audio: link
Text: This I read with great attention while they sat silent.
Synthesized audio: link
I am not sure whether I did it correctly. Could I discuss it with you? This is my email: mail
Hi, the link to the first audio you synthesized for the Chinese text seems to be wrong: it still opens the prompt file. 😂
@lifeiteng Hey thanks for the efforts! Any update on VALL-E X?
Same here. My solution is the same as yours (adding the language ID and using BigCiDian phonemes), and it seems to work.
How did you implement your VALLE-X?
Maybe you can refer to the e-x branch of my fork of this repository.
I don't see any VALLE-X related code updates in your repository. @skysbird
@skysbird How do you handle the language ID? I looked at your code, but it seems to follow the original VALL-E architecture.
I have embedded language id
Can you please drop a link to that code statement?
Nvm, found it.
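For anyone else following the language ID question, here is a minimal sketch of one common way to embed a language ID: look up a dense vector per language and add it to every token embedding in the sequence. This is not the code from the e-x branch, and the thread does not say which tokens carry the language embedding. The `LANG_IDS` mapping, the `Embedding` helper, `embed_with_language`, and all sizes are hypothetical names and values chosen for illustration; it uses plain Python lists so it runs without any dependencies.

```python
import random

# Hypothetical language-to-ID mapping; the thread does not specify one.
LANG_IDS = {"en": 0, "zh": 1}


class Embedding:
    """Tiny lookup table holding one `dim`-sized vector per integer ID."""

    def __init__(self, num_ids, dim, seed=0):
        rng = random.Random(seed)
        self.table = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                      for _ in range(num_ids)]

    def __call__(self, idx):
        return self.table[idx]


def embed_with_language(token_ids, lang, token_emb, lang_emb):
    """Add the language vector to every token embedding in the sequence."""
    lang_vec = lang_emb(LANG_IDS[lang])
    return [[t + l for t, l in zip(token_emb(tok), lang_vec)]
            for tok in token_ids]


# Assumed sizes: 100 token IDs, 4-dimensional embeddings.
token_emb = Embedding(num_ids=100, dim=4, seed=1)
lang_emb = Embedding(num_ids=len(LANG_IDS), dim=4, seed=2)

tokens = [5, 17, 42]  # placeholder token IDs
as_english = embed_with_language(tokens, "en", token_emb, lang_emb)
as_chinese = embed_with_language(tokens, "zh", token_emb, lang_emb)
print(len(as_english), len(as_english[0]))
```

The same token sequence gets a different representation per language, shifted by a constant language vector. In a real model both tables would be trainable layers (for example `torch.nn.Embedding`) rather than fixed random lists.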
So is there any update on implementing VALL-E X?
https://vallex-demo.github.io/