aedocw / epub2tts

Turn an epub or text file into an audiobook
Apache License 2.0
445 stars 44 forks

EPUB3 Support #120

Closed ndevries84 closed 3 months ago

ndevries84 commented 7 months ago

Any chance you could add EPUB3 support? What I mean is that instead of an m4b file, the output would still be an epub, with a multimedia layer. This would mean you can read the book normally, with TTS and highlighted words, as well as just listen to it like an audiobook.

It's really fantastic for people with dyslexia, or just those who like immersive reading like Amazon's Whispersync.

Here is an article that explains SMIL and the EPUB 3.3 standard: https://kb.daisy.org/publishing/docs/sync-media/overlays.html

aedocw commented 7 months ago

I am not at all familiar with EPUB3 but it looks really neat.

This would be a significant effort since it would require basically re-writing the original source material as the audio clips are being generated. It's definitely possible and would be some pretty interesting work. The way the audio is generated right now is at least vaguely compatible, though each generated clip is about 1 minute long, which is probably too long for this to be really useful. I imagine chunking down to one sentence at a time would be ideal.
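To make the sentence-at-a-time idea concrete, here is a rough Python sketch of what a media overlay for one chapter could look like. It is not epub2tts code. It assumes each sentence in the chapter's XHTML has been wrapped in an element with an id like `s1`, `s2`, ..., that the per-sentence clips are concatenated back to back into one audio file, and that the clip durations are known. The function names and file names are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

SMIL_NS = "http://www.w3.org/ns/SMIL"


def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def format_clock(seconds):
    """Format seconds as an h:mm:ss.mmm clock value."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h}:{m:02d}:{s:02d}.{ms:03d}"


def build_overlay(xhtml_name, audio_name, durations):
    """Build a SMIL document with one <par> per sentence.

    durations[i] is the length in seconds of the clip for sentence i + 1.
    Assumption: sentence i + 1 carries id "s{i + 1}" in xhtml_name.
    """
    ET.register_namespace("", SMIL_NS)
    smil = ET.Element(f"{{{SMIL_NS}}}smil", {"version": "3.0"})
    body = ET.SubElement(smil, f"{{{SMIL_NS}}}body")
    start = 0.0
    for i, duration in enumerate(durations, start=1):
        par = ET.SubElement(body, f"{{{SMIL_NS}}}par", {"id": f"par{i}"})
        # Text fragment that the reading system highlights
        ET.SubElement(par, f"{{{SMIL_NS}}}text", {"src": f"{xhtml_name}#s{i}"})
        # Matching slice of the chapter audio
        ET.SubElement(
            par,
            f"{{{SMIL_NS}}}audio",
            {
                "src": audio_name,
                "clipBegin": format_clock(start),
                "clipEnd": format_clock(start + duration),
            },
        )
        start += duration
    return ET.tostring(smil, encoding="unicode")


if __name__ == "__main__":
    sentences = split_sentences("Hello there. How are you? Fine!")
    # Hypothetical clip lengths, one per sentence
    print(build_overlay("chapter1.xhtml", "chapter1.mp3", [1.2, 1.0, 0.6]))
```

A real implementation would also need to rewrite the XHTML to add the sentence ids and register the overlay in the package manifest, which is the "re-writing the original source material" part of the effort.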

Really cool idea, I'll definitely leave this open. I don't think I'll have the time to tackle this any time soon but I would be happy to accept a contribution if anyone else wants to make an attempt.

aedocw commented 7 months ago

Here's a solution someone else wrote, looks really neat. You provide the DRM-free epub and the audiobook m4b, and it lets you do synchronized reading: https://smoores.gitlab.io/storyteller/

rejuce commented 6 months ago

I am spinning up a storyteller instance to see how it goes. Would it make sense to add an option to specify a storyteller API server, or better, a separate script to push there? I did not find API docs yet; I guess one has to dig through the source or observe the storyteller web UI to see which calls it makes to the API container.

aedocw commented 6 months ago

I am not really sure how well integration with storyteller would work out at this time, but if you have ideas on what that would look like, please share.

rejuce commented 6 months ago

If I can find out how the storyteller API works, I would make an additional script that pushes the final audiobook + text source to the storyteller API server to make it into an EPUB3.

It certainly is not super efficient, as storyteller first turns the audio back into text to match it with the source material.

If that works, though, epub2tts may need one more parameter to specify the storyteller API server; then epub2tts could call the script once it's finished making the final audio file. The advantage would be that we get end-to-end from source to EPUB3 without touching the conversion server in between.
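As a rough illustration of the "call a script once the audio file is done" idea, here is a standalone Python sketch. Everything in it is hypothetical: epub2tts has no `--post-hook` option that I know of, and the hook script (which would be the part that talks to storyteller) is not shown, since its API is still unknown in this thread.

```python
import argparse
import shlex
import subprocess


def build_parser():
    """Parser for a hypothetical wrapper; not the real epub2tts CLI."""
    parser = argparse.ArgumentParser(description="post-hook sketch")
    parser.add_argument("audiobook", help="path to the finished audio file")
    parser.add_argument("source", help="path to the source epub or text file")
    parser.add_argument(
        "--post-hook",
        default=None,
        help="command to run afterwards; both paths are appended as arguments",
    )
    return parser


def run_post_hook(hook, audiobook, source):
    """Run the hook command, if any, and return its exit code (None if unset)."""
    if not hook:
        return None
    command = shlex.split(hook) + [audiobook, source]
    result = subprocess.run(command, check=False)
    return result.returncode


if __name__ == "__main__":
    args = build_parser().parse_args()
    code = run_post_hook(args.post_hook, args.audiobook, args.source)
    if code not in (None, 0):
        print(f"post hook exited with status {code}")
```

Keeping the storyteller-specific logic in the external script means the main tool only needs to know "run this command with these two paths", so it would not depend on any particular server.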

aedocw commented 3 months ago

Closing this for now, there's no way I'll have time to work on this. It would be pretty complicated, as it would still need to do a full transcript creation with whisper after each chapter is done (due to the removal of long silences and normalization of pauses between sentences). I'm not sure doing this in epub2tts would be worth the effort. Happy to review and consider PRs from anyone who wants to add it though.
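For anyone who wants to attempt a PR, the alignment step after transcription could be sketched roughly like this in Python. It does not call whisper. It assumes some transcription step has already produced word-level timestamps as `(word, start, end)` tuples, which is a made-up format for this example, and it uses the standard library's `difflib` to match those words back to the source sentences.

```python
import difflib
import re


def normalize(word):
    """Lowercase and drop punctuation so source and transcript words compare."""
    return re.sub(r"[^a-z0-9']", "", word.lower())


def align_sentences(sentences, timed_words):
    """Return a (start, end) tuple per sentence, or None if nothing matched.

    timed_words is a list of (word, start_seconds, end_seconds) tuples,
    assumed to come from a separate transcription step.
    """
    src_tokens = []  # normalized words of the source text
    src_owner = []   # index of the sentence each source word belongs to
    for idx, sentence in enumerate(sentences):
        for word in sentence.split():
            token = normalize(word)
            if token:
                src_tokens.append(token)
                src_owner.append(idx)

    heard = [normalize(word) for word, _, _ in timed_words]
    matcher = difflib.SequenceMatcher(a=src_tokens, b=heard, autojunk=False)

    spans = [None] * len(sentences)
    # Matching blocks come back in increasing order, so the first matched
    # word of a sentence gives its start and the last one gives its end.
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            owner = src_owner[block.a + k]
            _, start, end = timed_words[block.b + k]
            if spans[owner] is None:
                spans[owner] = (start, end)
            else:
                spans[owner] = (spans[owner][0], end)
    return spans


if __name__ == "__main__":
    sentences = ["Hello world.", "This is fine."]
    timed = [
        ("hello", 0.0, 0.4), ("world", 0.4, 0.9),
        ("this", 1.2, 1.5), ("is", 1.5, 1.6), ("fine", 1.6, 2.0),
    ]
    print(align_sentences(sentences, timed))
```

Sentences whose words were all misheard come back as `None`, so a real implementation would need a fallback, for example interpolating from the neighboring sentences.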