Does the AR model need audio_prompt?

lifeiteng / vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

Apache License 2.0

Closed: sherryxie1 closed this issue 1 year ago

sherryxie1 commented 1 year ago

I can't find an audio_prompt input for the AR model, but it seems helpful in the paper. Is there any reason not to use audio_prompt?

lifeiteng commented 1 year ago

You should read the paper again. x_0 ... x_t are the prompts of x_{t+1} ...
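To illustrate the point in the reply above: in an autoregressive decoder, the acoustic prompt does not need a separate input, because the already-known audio tokens x_0 ... x_t simply form the prefix that each later token is conditioned on. The following is a minimal sketch of that idea, not code from this repository. The names `ar_decode` and `toy_next_token` are hypothetical, and the toy function only stands in for a trained decoder.

```python
from typing import Callable, List, Sequence


def ar_decode(
    text_tokens: Sequence[int],
    prompt_audio_tokens: Sequence[int],
    next_token: Callable[[Sequence[int], Sequence[int]], int],
    eos: int,
    max_new_tokens: int = 50,
) -> List[int]:
    """Greedy autoregressive decoding where the audio prompt is the prefix."""
    audio = list(prompt_audio_tokens)  # x_0 ... x_t: the prompt
    for _ in range(max_new_tokens):
        # Each step is conditioned on the text and on all earlier audio tokens.
        tok = next_token(text_tokens, audio)
        if tok == eos:
            break
        audio.append(tok)  # x_{t+1}, x_{t+2}, ...
    # Return only the newly generated continuation.
    return audio[len(prompt_audio_tokens):]


def toy_next_token(text_tokens: Sequence[int], audio_so_far: Sequence[int]) -> int:
    # Hypothetical stand-in for a trained decoder: previous token + 1, capped at 9.
    last = audio_so_far[-1] if audio_so_far else 0
    return min(last + 1, 9)


generated = ar_decode([1, 2, 3], [4, 5], toy_next_token, eos=9)
print(generated)
```

Calling `ar_decode` with an empty `prompt_audio_tokens` list runs the same loop with no prefix, which is why the prompt can be seen as part of the sequence rather than as an extra model argument.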