Open yesbroc opened 8 months ago
Yeah, i was actually kind of planning to add something for that, a setting that lets you choose between using the same history over and over, and looping the history back.
Since putting the same clip over and over causes inconsistencies with emotions etc. While looping it back has issues with it gaining noise, but having a more realistic, but also slowly degrading output.
I'm wondering if there's a way I could get the best of both worlds, like some denoiser to improve the audio quality throughout loopbacks, which would allow it to be consistent, but not lose the quality.
what does 'history' mean in this context?
Bark uses a system called "history prompts" which are basically context for the language model, It allows it to retain the same voice. These history prompts are stored in .npz files, containing 3 .npy files. The coarse, fine and semantic prompts.
ahh ok, what if for the whole prompt, they use the first sentence as history, then the rest of the paragraph uses that first sentence's history.
or the webui can detect whether a "[]" pops up and treats that as it's own history, which it uses for that one sentence.
Custom formatting for controlling the prompts could be useful. I'll think about it.
is there also a way to use multiple voices in one prompt?
Not currently but custom formatting could make it possible
Basically, because quality degrades the longer the input is, making continuations from worsening quality outputs will severely impact the overall output. I've found this to be true while using Strict long and short.
I believe stitching together multiple, independent outputs would keep the output quality consistent rather than continuing from degrading outputs.