jasonppy / VoiceCraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

about silence tokens during inference #145

Open thivux opened 4 months ago

thivux commented 4 months ago

I see that the default value of silence_tokens during inference is [1388,1898,131]. My questions:

  1. Why is there more than one silence token?
  2. How do silence_tokens differ from the <SIL> phoneme in vocab.txt?
  3. How can I find the silence tokens when training on my own dataset?