knochenhans / tts_arranger

MIT License

Issue with TTS_Writer #2

Open pigalot opened 1 year ago

pigalot commented 1 year ago

I'm using some fairly simple code to try out making a basic audiobook from an EPUB.

from tts_arranger import TTS_EPUB_Reader, TTS_Writer

preferred_speakers = ['p273', 'p330']

reader = TTS_EPUB_Reader(preferred_speakers)
reader.load('C:/dev/ebooks/data/1.epub')

project = reader.get_project()

writer = TTS_Writer(project, 'C:/dev/ebooks/output/one/', preferred_speakers = preferred_speakers)
writer.synthesize_and_write(project.author + ' - ' + project.title)

writer.synthesize_and_write works, but it randomly fails with "Synthesizing project .... failed: Error synthesizing ...: Dimension out of range (expected to be in range of [-2, 1], but got 2)".

I have not found anything in common among the items it is trying to synthesize when it fails. I have not seen it happen on the same item more than once, and I have seen it happen with multiple EPUBs.

Any idea what this might be? Is there a good way to force a retry or something?

On a side note, this is such a cool project. It would be huge to be able to get EPUBs read out.

knochenhans commented 1 year ago

Hi, I’ve been running into this as well lately and have been unable to reproduce it reliably or find the root cause. It seems to be a bug in the underlying TTS engine (https://github.com/coqui-ai/TTS/discussions/2516). For now, I’ll shortly release an update with a workaround that catches the exception and synthesizes the item again.
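Until such an update is available, a generic retry wrapper is one way to force a retry from user code. The sketch below is not the library's own workaround: `retry`, `attempts`, `delay`, and `exceptions` are hypothetical names introduced here, and it only helps if the failing call actually raises the exception rather than logging it and continuing.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    func: Callable[[], T],
    attempts: int = 3,
    delay: float = 0.0,
    exceptions: Tuple[Type[BaseException], ...] = (Exception,),
) -> T:
    """Call func() until it succeeds or the attempts are used up.

    Hypothetical helper for illustration; not part of tts_arranger.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions as exc:
            # Re-raise the original error once the last attempt has failed.
            if attempt == attempts:
                raise
            print(f"Attempt {attempt} failed ({exc}); retrying")
            time.sleep(delay)
    raise AssertionError("unreachable")  # loop always returns or raises
```

With the code from the first post, this could be used as `retry(lambda: writer.synthesize_and_write(project.author + ' - ' + project.title))`. Note that this repeats the whole call, so anything already synthesized may be done again; retrying per item inside the library would be cheaper.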

Also, thanks :) Let me know in case you have any ideas for further improvements, as this mostly grew around my personal needs (reading out ebooks and websites/blogs). Though I guess more examples and documentation would be a good start...

pigalot commented 1 year ago

I managed to get everything running from code, and it's been running for a few hours with no errors, so it's looking good.