X-LANCE / UniCATS-CTX-vec2wav

[AAAI 2024] Code for CTX-vec2wav in UniCATS
https://cpdu.github.io/unicats/

Inference Speed #2

Open rishikksh20 opened 10 months ago

rishikksh20 commented 10 months ago

Hi @cantabile-kwok, I have also implemented UniCATS's vec2wav, but my implementation is too slow, so I am curious about the inference speed of this model. I am interested in integrating CTX-vec2wav with a GPT-based AR txt2vec to create a fast prompt-based TTS model.

Also, do you have any plans to release the CTX-txt2vec model anytime soon? Thanks.

cantabile-kwok commented 10 months ago

Hi, thanks for the interest!

From a previous log on GPU, the following speed was reported:

11.94it/s, RTF=0.0106

May I know the speed of your implementation, since you found it too slow? I think the speed above should be fine for regular use cases.
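For comparing numbers between implementations, here is a minimal sketch of how a real-time factor could be measured. It assumes the usual definition (synthesis time divided by the duration of the generated audio), which this thread does not spell out. `measure_rtf` and `dummy_vocoder` are hypothetical names for illustration and are not part of this repository.

```python
import time
from typing import Callable, Sequence


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(synthesize: Callable[[], Sequence[float]], sample_rate: int) -> float:
    """Time one call to a hypothetical `synthesize` and return its RTF."""
    start = time.perf_counter()
    waveform = synthesize()
    elapsed = time.perf_counter() - start
    # Audio duration in seconds is the number of samples over the sample rate.
    return real_time_factor(elapsed, len(waveform) / sample_rate)


def dummy_vocoder() -> list:
    # Stand-in for a real vocoder call: 1 second of silence at 16 kHz.
    return [0.0] * 16000


if __name__ == "__main__":
    print(f"RTF={measure_rtf(dummy_vocoder, 16000):.4f}")
```

Under this assumed definition, an RTF of 0.0106 would correspond to roughly 0.106 s of compute for 10 s of audio. When timing a GPU model, the device should be synchronized before the timer is stopped (for example with `torch.cuda.synchronize()` in PyTorch); otherwise the measured time can be too optimistic.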

CTX-text2vec is a bit harder to open-source, but we will get to it soon (probably within this month, though I can't be 100% certain). Please stay tuned if you are interested : )

cantabile-kwok commented 9 months ago

@rishikksh20 Hi Rishikesh, for your information, the acoustic model part (text2vec) is now finished and available at https://github.com/cantabile-kwok/UniCATS-CTX-text2vec.

rishikksh20 commented 9 months ago

Thanks @cantabile-kwok, I am already following that repo. I will check end-to-end training.