lifeiteng / vall-e

PyTorch implementation of VALL-E(Zero-Shot Text-To-Speech), Reproduced Demo https://lifeiteng.github.io/valle/index.html
Apache License 2.0

Long Inference Time #142

Open debasishaimonk opened 1 year ago

debasishaimonk commented 1 year ago

The VALL-E model takes a very long time to generate voices. Are there any open issues or PRs to address this? Has there been any discussion about how to speed it up?

lifeiteng commented 1 year ago

Implementing a KV cache can give a 10-20x speedup.
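
For readers landing on this thread: a KV cache stores each attention layer's key/value tensors from previous decoding steps so they are not recomputed for every new token. Below is a minimal, hypothetical PyTorch sketch of the idea; names like `CachedSelfAttention` are illustrative and not from this repo.

```python
import torch
import torch.nn.functional as F


class CachedSelfAttention(torch.nn.Module):
    """Illustrative self-attention layer with an incremental KV cache."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.n_heads = n_heads
        self.head_dim = d_model // n_heads
        self.qkv = torch.nn.Linear(d_model, 3 * d_model)
        self.out = torch.nn.Linear(d_model, d_model)

    def forward(self, x, cache=None):
        # x: (batch, new_tokens, d_model); during incremental decoding
        # new_tokens == 1, so Q/K/V are projected for one token only.
        b, t, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        shape = (b, t, self.n_heads, self.head_dim)
        q, k, v = (z.view(shape).transpose(1, 2) for z in (q, k, v))
        if cache is not None:
            # Reuse K/V from all previous steps instead of recomputing them.
            k = torch.cat([cache[0], k], dim=2)
            v = torch.cat([cache[1], v], dim=2)
        new_cache = (k, v)
        # Causal masking is only needed on the multi-token prefill pass;
        # a single new query may attend to the entire cached prefix.
        y = F.scaled_dot_product_attention(q, k, v, is_causal=cache is None and t > 1)
        y = y.transpose(1, 2).reshape(b, t, -1)
        return self.out(y), new_cache


# Usage sketch: feed the full prompt once, then one token per step,
# passing the growing cache back in each time.
attn = CachedSelfAttention(d_model=256, n_heads=4)
x = torch.randn(1, 10, 256)   # prompt
y, cache = attn(x)            # prefill: K/V computed for all 10 tokens
step = torch.randn(1, 1, 256) # one new token
y, cache = attn(step, cache)  # decode step: only one new K/V computed
```

The speedup comes from each decode step projecting Q/K/V for only the newest token rather than reprojecting the entire prefix on every step.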

debasishaimonk commented 1 year ago

Has it been implemented?

RahulBhalley commented 1 year ago

The VALL-E model takes a very long time to generate voices.

How long does it take?

RuntimeRacer commented 1 year ago

I think the inference times on an RTX 3090 / 4090 are actually quite acceptable. Not perfect, but acceptable. What hardware are you using?

Implementing a KV cache can give a 10-20x speedup.

@lifeiteng Is this planned to be added to the repo? A speedup of that magnitude would be incredible.

lifeiteng commented 1 year ago

@RuntimeRacer I don't have time to do it.

RahulBhalley commented 1 year ago

I think the inference times on an RTX 3090 / 4090 are actually quite acceptable. Not perfect, but acceptable. What hardware are you using?

I'll probably use a 4090. Do you know how much time it takes? I haven't run the code yet, just exploring my options right now.

bank010 commented 12 months ago

I think the inference times on an RTX 3090 / 4090 are actually quite acceptable. Not perfect, but acceptable. What hardware are you using?

Implementing a KV cache can give a 10-20x speedup.

Is this planned to be added to the repo? A speedup of that magnitude would be incredible.

Hello, do you have any experience in this field? It would be incredible if you did.