aedocw / epub2tts

Turn an epub or text file into an audiobook
Apache License 2.0

" get sometimes split into a individual token ending as audio glitch #148

Closed rejuce closed 6 months ago

rejuce commented 6 months ago

Text splitted to sentences.

['The new field of Monsterology that Ichiha created has led to new perspectives on the origin of man, animal, plant, and monster.', '”', 'The dungeon theory on the origin of life, huh?', "This was also information we hadn't made public, but, Yeah, she has a point.", '“In my view, this is a phenomenon that fuses science and religion, and links our worlds together. You must have some sense of that yourself,” Genia continued, sounding uncharacteristically serious.']

Possible workaround: remove all " characters before piping the text to TTS?
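A minimal sketch of that workaround, assuming the text is available as a plain Python string before it is split into sentences. The helper name `strip_double_quotes` and the set of quote characters are hypothetical and not taken from epub2tts:

```python
# Hypothetical helper, not part of epub2tts.
# Assumption: only straight and curly double quotes need removing.
QUOTE_CHARS = '"\u201c\u201d'  # ", “ and ”
_STRIP_QUOTES = str.maketrans("", "", QUOTE_CHARS)


def strip_double_quotes(text):
    """Return text with all double-quote characters deleted."""
    return text.translate(_STRIP_QUOTES)
```

Note that this also drops quotes that help mark dialogue, so it trades some sentence-boundary information for fewer stray tokens.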

aedocw commented 6 months ago

I think I'll add a check during segmentation so it throws out anything that has no letters or numbers at all. I've seen other things like this happen, such as "..." getting segmented as its own sentence, after which TTS just makes some random sounds. I expect that requiring every sentence to have at least one letter or number in it should pretty effectively prevent issues like this.

Thanks for logging the bug with this information, really appreciate it! I should be able to get to this soon.

aedocw commented 6 months ago

I can't seem to trick the tokenizer into creating just a single item with just a quote in it. I added stuff in the "dont-say-that" branch to drop any sentences that do not have any letters or numbers in them though. If you can check that branch out and try it against the copy you have I would appreciate it. Thanks!
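The kind of filter described here could look roughly like the sketch below. It is an illustration only: the function name `drop_unspeakable` is hypothetical, and the actual code in the "dont-say-that" branch may differ.

```python
# Hypothetical sketch, not the code from the "dont-say-that" branch.
def drop_unspeakable(sentences):
    """Keep only sentences that contain at least one letter or digit."""
    # str.isalnum() is True for letters and digits (including non-ASCII ones)
    # and False for quotes, dots, and other punctuation.
    return [s for s in sentences if any(ch.isalnum() for ch in s)]
```

Applied to the list from the report, this would discard the lone '”' entry and leave the other sentences untouched.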

aedocw commented 6 months ago

I manually inserted some items like just '"' as a sentence, and some others that had nothing but punctuation, and they are safely removed with this. I'm going to merge it, as I'm eager to do everything possible to sanitize what gets sent to TTS.

rejuce commented 6 months ago

How do I update? Pull and then run pip install . again?

aedocw commented 6 months ago

Yes, update with "git pull" and then you can do "pip install . --upgrade"