p0n1 / epub_to_audiobook

EPUB to audiobook converter, optimized for Audiobookshelf
MIT License

Epubs with references do not convert properly #69

Open estrellagus opened 3 months ago

estrellagus commented 3 months ago

A few books I checked that have references and citation links (to other parts of the book) do not process properly. The citation links are each read first, and then none of the chapter text is converted.

I have included a sample open-source EPUB that shows the issue: georgia-pls-ssml.epub.zip

I suspect this is an issue with the EPUB library or with how it is being called, but that is beyond my programming abilities.

estrellagus commented 3 months ago

I did some more digging, running with debug mode and text output, and noticed that the spoken text is preceded by the code '@BRK#'. As a test, I added a global replace for this string, and now the file is processed properly.

In the file epub_book_parser.py, I added the code below at line 68:

        # Replace break characters with a newline.
        cleaned_text = re.sub(r'@BRK#', '\n', cleaned_text)

So far, this works in all my testing across many files.
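
For anyone who wants to try the replacement outside the parser, here is a minimal standalone sketch of the same substitution. The function name `replace_break_markers`, the constant `BREAK_MARKER`, and the sample string are hypothetical and only for illustration; the thread confirms only the literal '@BRK#' and the `re.sub` call, not how the project structures this step.

```python
import re

# Hypothetical constant name; the thread only shows the literal '@BRK#'.
BREAK_MARKER = "@BRK#"


def replace_break_markers(text: str) -> str:
    """Replace every literal break marker in the text with a newline."""
    # re.escape keeps the marker literal even if it ever contains
    # regex metacharacters.
    return re.sub(re.escape(BREAK_MARKER), "\n", text)


# Made-up sample text, not taken from the attached EPUB.
sample = "@BRK#Chapter 1@BRK#It was a dark night."
cleaned = replace_break_markers(sample)
print(repr(cleaned))  # prints '\nChapter 1\nIt was a dark night.'
```

Because the marker is a fixed string, `text.replace(BREAK_MARKER, "\n")` would give the same result; `re.sub` is used here only to mirror the fix described above.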

p0n1 commented 3 months ago

Nice. Will try your fix. Thanks!