JarodMica / audiobook_maker

GNU General Public License v3.0

Text parsing enhanced #47

Closed shakenbake15 closed 2 months ago

shakenbake15 commented 3 months ago

Thanks for checking out my previous update to text parsing. I have enhanced it further and found that this version works even better.

Goal for text parsing: get each line to be a medium-length sentence, which works best for tortoise tts, and reduce repeated text and weird effects.

Text Parsing summary:

  1. Sentences under 7 words get a short bracketed text inserted so that there is less chance of repetition.
  2. Sentences over 20 words are split into smaller blocks at commas or dashes.
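
The two rules above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names, the `FILLER` placeholder text, and the greedy regrouping of comma-separated pieces are all assumptions made for the example. Only the 7-word and 20-word thresholds come from the summary.

```python
import re

MIN_WORDS = 7    # from the summary: sentences under 7 words get bracketed text
MAX_WORDS = 20   # from the summary: sentences over 20 words get split
FILLER = "[filler]"  # hypothetical placeholder; the PR's actual bracketed text is not shown


def split_long_sentence(sentence, max_words=MAX_WORDS):
    """Split a long sentence at commas or spaced dashes into smaller blocks."""
    if len(sentence.split()) <= max_words:
        return [sentence]
    # Break after a comma (the comma stays with the preceding piece) or at a
    # dash surrounded by spaces (the dash itself is dropped).
    pattern = r"(?<=,)\s+|\s+[-\u2013\u2014]\s+"
    pieces = [p.strip() for p in re.split(pattern, sentence) if p.strip()]
    blocks, current = [], ""
    for piece in pieces:
        candidate = f"{current} {piece}".strip()
        # Greedily regroup pieces so blocks stay at or under max_words when possible.
        if current and len(candidate.split()) > max_words:
            blocks.append(current)
            current = piece
        else:
            current = candidate
    if current:
        blocks.append(current)
    return blocks


def parse_line(sentence):
    """Apply the short-sentence and long-sentence rules to one sentence."""
    word_count = len(sentence.split())
    if word_count < MIN_WORDS:
        return [f"{FILLER} {sentence}"]
    if word_count > MAX_WORDS:
        return split_long_sentence(sentence)
    return [sentence]
```

A piece with no comma or dash is left as is even if it exceeds 20 words, since the sketch has no other place to break it.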

Goal for silences: reduce how much silence tortoise tts produces. I've noticed that the more silence you give tortoise tts to play around with, the more likely it is to make weird sounds or repeat text that should not be repeated.

Silence:

  1. I'd recommend lowering the pause_size field in your tort.yaml to 1. I'm getting better results this way.
  2. I added a flag to the text as it generates to indicate whether the line is the beginning of a new paragraph. Then I updated the audiobook merge code to add 0.5 second pauses at those locations. I find this adds breaks at more natural places than just moving the slider to add silences between lines. It also prevents longer silences where the text parsing has made a break at a comma.
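
As a rough illustration of the paragraph flag and the merge step, here is a minimal sketch. It is not the PR's implementation: the function names are hypothetical, paragraphs are assumed to be separated by blank lines, each non-empty line is assumed to map to one generated clip, and clips are modelled as plain lists of samples instead of real audio files. Only the 0.5 second pause comes from the description.

```python
PARAGRAPH_PAUSE_SECONDS = 0.5  # from the description: 0.5 s pause at paragraph starts


def flag_paragraph_starts(text):
    """Return (line, starts_paragraph) pairs for every non-empty line.

    Assumes paragraphs are separated by one or more blank lines.
    """
    flagged = []
    starts_paragraph = True
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line means the next non-empty line opens a new paragraph.
            starts_paragraph = True
            continue
        flagged.append((line, starts_paragraph))
        starts_paragraph = False
    return flagged


def merge_clips(clips, flags, sample_rate, pause_seconds=PARAGRAPH_PAUSE_SECONDS):
    """Concatenate clips, inserting silence only before paragraph starts.

    No silence is added before the very first clip, and none is added between
    lines of the same paragraph (for example where parsing split at a comma).
    """
    silence = [0.0] * int(sample_rate * pause_seconds)
    merged = []
    for index, (clip, starts_paragraph) in enumerate(zip(clips, flags)):
        if starts_paragraph and index > 0:
            merged.extend(silence)
        merged.extend(clip)
    return merged
```

For example, three clips where only the third opens a new paragraph would be joined with a single 0.5 second gap before the third clip.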

Sorry I wasn't on GitHub to see the comments you left on my last updates. I'll look at turning on notifications in the app or something, so hopefully I'll catch it if you respond to these changes.