p0n1 / epub_to_audiobook

EPUB to audiobook converter, optimized for Audiobookshelf
MIT License

We Should Be Able to Search And Replace Text #80

Closed reverendj1 closed 2 months ago

reverendj1 commented 4 months ago

There are many times when it would be beneficial to search and replace text in a book, before generating the audio narration. The biggest reasons would be to:

This PR adds that functionality by letting the user specify a simple text file of search-and-replace rules, like this:

# this is the general structure
<search>==<replace>
# this is a comment
# fix cardinal direction abbreviations
N\.E\.==north east
# be careful with your regexes, as this would also match Sally N. Smith
N\.==north
# pronounce Barbadoes like the locals
Barbadoes==Barbayduss

The file is then passed on the command line with --search_and_replace_file:

python3 main.py examples/The_Life_and_Adventures_of_Robinson_Crusoe.epub output_folder --search_and_replace_file search.conf
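
To make the file format above concrete, here is a minimal Python sketch of how such rules could be parsed and applied. It is not the code from this PR: the function names `parse_rules` and `apply_rules` are hypothetical, and the sketch assumes that each rule is split on the first `==`, that the search side is a regular expression, and that rules run in file order.

```python
import re
from typing import List, Tuple


def parse_rules(text: str) -> List[Tuple[re.Pattern, str]]:
    """Parse '<search>==<replace>' lines, skipping blank lines and '#' comments.

    Hypothetical helper for illustration; not the PR's actual implementation.
    """
    rules = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Assumption: split on the first '==' only.
        search, sep, replacement = line.partition("==")
        if not sep:
            continue  # no '==' found, so this sketch ignores the line
        rules.append((re.compile(search), replacement))
    return rules


def apply_rules(text: str, rules: List[Tuple[re.Pattern, str]]) -> str:
    """Apply every rule in order (assumption: file order is preserved)."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


CONF = r"""
# fix cardinal direction abbreviations
N\.E\.==north east
N\.==north
# pronounce Barbadoes like the locals
Barbadoes==Barbayduss
"""

rules = parse_rules(CONF)
print(apply_rules("We sailed N.E. toward Barbadoes.", rules))
# prints: We sailed north east toward Barbayduss.
print(apply_rules("Sally N. Smith", rules))
# prints: Sally north Smith
```

The second call shows the caveat from the comment in the example file: a broad pattern such as `N\.` also rewrites a middle initial. Under the ordering assumption of this sketch, the more specific `N\.E\.` rule also has to appear before `N\.`, otherwise "N.E." would already have been turned into "northE." by the time the specific rule runs.
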

Bryksin commented 3 months ago

Hmm, this sounds reasonable from one point of view. From another, though, preparing a book for audio is a complex task, and it is better to use a proper EPUB editor to edit the book. This project is not about editing the book but rather taking what is there...

I don't know about this PR; I'm in doubt... @p0n1, we need your final decision.

p0n1 commented 3 months ago

Thanks, @reverendj1, for implementing this. While I haven't been in a similar situation myself, I can imagine this feature being very helpful to those who need it. The code in this PR is also minimally intrusive and doesn't affect other modules, so I'm in favor of merging it.

reverendj1 commented 3 months ago

Thank you both for your work on this great project. I have a feeling I will be using it a lot! I almost always buy the audiobook alongside the ebook, but many books I read simply don't have an audiobook version.

@Bryksin I feel like it's akin to the --remove_endnotes or --newline_mode options. You aren't trying to fix up the epub; you are making changes, on the fly, that are specific to processing it into an audiobook. Many of the books I have use the same abbreviations and foreign words/characters, even though they are English books. I'll probably end up with dozens of replacements that need to be performed on each book on this subject. With this method, I can create one file that fixes those issues and apply it to any of those books during conversion to an audiobook. Otherwise, I'd have to copy each epub, open and manually modify it in an epub editor, and finally delete the edited copy after the audiobook is generated.

@p0n1 I chose examples from Robinson Crusoe to match existing documentation and make it easier to show how to use it. However, when the AI gets the pronunciation of the subject matter or main character of a book wrong, and it's repeated in every other line, it becomes quite a distraction!

Thank you for looking at my PR.

p0n1 commented 2 months ago

Merged! @reverendj1 Thanks again.