lifeiteng / vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
Apache License 2.0
1.99k stars 320 forks

Question about --prefix-mode #144

Closed (jzssz closed 9 months ago)

jzssz commented 1 year ago
  1. What exactly does --prefix-mode mean? I read that the options are "0: no prefix, 1: 0 to random, 2: random to random, 4: chunk of pre or post utterance", but I couldn't understand what that means.
  2. Does it have to be set to the same value for training and inference?
  3. When I train the AR model, can I choose any of 0, 1, 2, 4? Can I choose any of them when I train the NAR model? Must the AR and NAR models be trained with the same value? Thanks very much!
chenjiasheng commented 1 year ago

--prefix-mode has nothing to do with AR. For NAR, you can always use --prefix-mode=1 for simplicity.
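
For readers still puzzled by the option descriptions quoted in the question, here is a minimal sketch of one way to read them. It is not the repository's code. It assumes the "prefix" is the span of audio frames given to the NAR model as an acoustic prompt, and that modes 1 and 2 cut that span out of the training utterance itself while mode 4 takes it from a neighbouring utterance. The function name `pick_prompt_span`, its parameters, and the length bounds are hypothetical and chosen only for illustration.

```python
import random


def pick_prompt_span(num_frames, prefix_mode, rng, neighbor_frames=0):
    """Return (source, start, end) for a hypothetical acoustic prompt.

    source is "none", "same" (cut from this utterance) or "neighbor"
    (taken from the previous or next utterance). Illustration only.
    """
    if num_frames < 2:
        raise ValueError("need at least 2 frames")
    if prefix_mode == 0:
        # 0: no prefix at all
        return ("none", 0, 0)
    if prefix_mode == 1:
        # 1: "0 to random" -> starts at frame 0, ends at a random frame
        end = rng.randint(1, num_frames - 1)
        return ("same", 0, end)
    if prefix_mode == 2:
        # 2: "random to random" -> random start and random end
        start = rng.randint(0, num_frames - 2)
        end = rng.randint(start + 1, num_frames - 1)
        return ("same", start, end)
    if prefix_mode == 4:
        # 4: "chunk of pre or post utterance" -> the whole neighbouring chunk
        return ("neighbor", 0, neighbor_frames)
    raise ValueError(f"unsupported prefix_mode: {prefix_mode}")


if __name__ == "__main__":
    rng = random.Random(0)
    for mode in (0, 1, 2, 4):
        print(mode, pick_prompt_span(100, mode, rng, neighbor_frames=80))
```

Under this reading, mode 1 is the simplest self-contained choice, since the prompt is always the beginning of the same utterance and no neighbouring data is needed.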