p0n1 / epub_to_audiobook

EPUB to audiobook converter, optimized for Audiobookshelf
MIT License

feat: local tts support #18

Open timgreen opened 1 year ago

timgreen commented 1 year ago

Use Piper as the default.

issue #16

p0n1 commented 1 year ago

Wow. Neat work. Will test and work on merge after this weekend. Thanks for the contribution.

p0n1 commented 1 year ago

I'm experiencing some issues with installing Piper on Mac, and I'm still working on resolving them.

timgreen commented 1 year ago

@p0n1, I included another example in the README for https://github.com/coqui-ai/TTS; maybe you could try that instead.

p0n1 commented 1 year ago

Hi @timgreen. I didn't expect that it could support both Coqui and Piper at the same time. That's really cool. I originally planned to test the use of Piper in a Linux environment, without considering the installation issues on Mac. However, my work has been busy lately, so I have to postpone it.

In addition, new contributors have joined recently. @Bryksin has done a great job and has comprehensively refactored the code to make it easier to integrate more TTS engines in the future. You can find some discussions here: https://github.com/p0n1/epub_to_audiobook/issues/21#issuecomment-1824987948.

I apologize again for not being able to merge your code in a timely manner. Perhaps it would be better if you could contribute based on the refactored code once it is merged. Or I can help if you don't have the time.

Besides, we now have a Discord server for syncing up and general discussion. Feel free to join. You can find the invite URL here: https://github.com/p0n1/epub_to_audiobook/issues/15#issuecomment-1825854839.

EnderSyth commented 7 months ago

@timgreen I've attempted to build your branch to see if I could get Piper TTS working, as I find it far better than Edge_TTS. However, when I run the command with --tts local, I get the following error:

epub_to_audiobook.py: error: argument --tts: invalid choice: 'local' (choose from 'azure', 'openai')

Apologies if this isn't the correct place to ask this question.

timgreen commented 7 months ago

@EnderSyth, this PR hasn't been merged yet. So you will need to try from my branch: https://github.com/timgreen/epub_to_audiobook/tree/local_tts

EnderSyth commented 7 months ago

> @EnderSyth, this PR hasn't been merged yet. So you will need to try from my branch: https://github.com/timgreen/epub_to_audiobook/tree/local_tts

That is the one I cloned

git clone https://github.com/timgreen/epub_to_audiobook.git

It executes via 'epub_to_audiobook.py', which I believe is unique to your branch.

After following the normal build process, I used the code example from that repo:

`python3 epub_to_audiobook.py "path/to/book.epub" "path/to/output/folder" --tts local`

But I modified it as follows for my environment:

`python .\epub_to_audiobook.py .\Saved_by_certain.epub .\Test\ --tts local`

usage: epub_to_audiobook.py [-h] [--tts {azure,openai}] [--log LOG] [--preview] [--language LANGUAGE] [--newline_mode {single,double}] [--chapter_start CHAPTER_START] [--chapter_end CHAPTER_END] [--output_text] [--remove_endnotes] [--voice_name VOICE_NAME] [--break_duration BREAK_DURATION] [--output_format OUTPUT_FORMAT] [--openai_model OPENAI_MODEL] [--openai_voice OPENAI_VOICE] [--openai_format OPENAI_FORMAT] input_file output_folder

epub_to_audiobook.py: error: argument --tts: invalid choice: 'local' (choose from 'azure', 'openai')
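Editor's note: the error above is the standard argparse response when a value is not in an option's `choices` list, which suggests the checked-out code only registers `azure` and `openai`. The following is a minimal sketch of that mechanism, not the project's actual parser; the function names and the reduced set of arguments are hypothetical, and the assumption that the `local_tts` branch adds `local` to the choices is inferred from this thread.

```python
import argparse


def build_parser(tts_choices):
    """Build a reduced, illustrative parser (not the project's real one)."""
    parser = argparse.ArgumentParser(prog="epub_to_audiobook.py")
    parser.add_argument("input_file")
    parser.add_argument("output_folder")
    # argparse rejects any --tts value that is not listed in `choices`.
    parser.add_argument("--tts", choices=tts_choices, default=tts_choices[0])
    return parser


def accepts_tts(tts_choices, value):
    """Return True if `--tts value` parses, False if argparse exits with an error."""
    parser = build_parser(tts_choices)
    try:
        parser.parse_args(["book.epub", "out", "--tts", value])
    except SystemExit:
        # argparse prints the usage and error text to stderr, then exits.
        return False
    return True
```

With the choices shown in the usage text, `accepts_tts(["azure", "openai"], "local")` is rejected, while a parser that also lists `local` accepts it. Seeing only `azure` and `openai` in the usage line is therefore a quick sign that the wrong branch is checked out.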

danielw97 commented 7 months ago

Timgreen, thanks for your work on this. Whilst it unfortunately hasn't been merged yet, I've been able to test this branch locally on Linux and it's working well for me so far. Local TTS is something I've been waiting on with this utility, and Piper is a good choice in my book.

danielw97 commented 7 months ago

EnderSyth, make sure to check out the local_tts branch after cloning the repo.
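Editor's note: a plain `git clone` leaves you on the repository's default branch, so the branch has to be selected explicitly. Below is a minimal sketch of one way to do that from Python by shelling out to git; the helper names are hypothetical, and the repository URL and branch name are the ones quoted in this thread.

```python
import subprocess

REPO_URL = "https://github.com/timgreen/epub_to_audiobook.git"
BRANCH = "local_tts"


def clone_command(url=REPO_URL, branch=BRANCH):
    """Return a git command that clones the repo and checks out `branch` directly."""
    return ["git", "clone", "--branch", branch, "--single-branch", url]


def current_branch(repo_dir):
    """Return the name of the branch currently checked out in `repo_dir`."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Usage (requires git and network access):
#   subprocess.run(clone_command(), check=True)
#   current_branch("epub_to_audiobook")
```

If the repository is already cloned, running `git checkout local_tts` inside it has the same effect.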

EnderSyth commented 7 months ago

> EnderSyth, make sure to check out the local_tts branch after cloning the repo.

Why, thank you. I'm new to this, so I missed that bit. After doing that I can indeed run it, though now it's giving me issues with piper not being a recognized command. I'm trying to figure out how to get that installed, but I can't get pip install piper-tts working because piper-phonemize is missing, and that can't be found either. But at least I'm one step further, thank you.
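Editor's note: "not a recognized command" usually means the executable is not on PATH for the shell or virtual environment in use. The following is a minimal sketch of a check using the standard library's `shutil.which`; the helper names are hypothetical, and the assumption that the executable is called `piper` comes from the message above.

```python
import shutil


def find_command(name):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


def describe(name="piper"):
    """Return a one-line summary that is handy to paste into a bug report."""
    path = find_command(name)
    if path is None:
        return f"{name}: not found on PATH"
    return f"{name}: {path}"
```

Running `print(describe())` inside the activated venv shows whether the install step actually placed a `piper` executable where the script can find it.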

danielw97 commented 7 months ago

No problem at all, happy to help. Are you running Python 3.12 by any chance? I tried to build this on Ubuntu 24.04 and ran into a similar issue, although setting up a venv (a virtual environment) with Python 3.11 fixed it. Hope this helps.
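Editor's note: since the fix reported here depends on the interpreter version, it can help to check which Python a venv is really using. This is a minimal sketch with hypothetical names; the only version this thread reports as working is 3.11, so the set below records that single data point and says nothing definitive about other versions.

```python
import sys

# Versions reported as working in this thread only (an assumption, not a
# compatibility matrix published by piper-tts or this project).
REPORTED_WORKING = {(3, 11)}


def major_minor(version_info=sys.version_info):
    """Return (major, minor) for the given version tuple."""
    return (version_info[0], version_info[1])


def reported_working(version=None):
    """Return True if `version` matches one reported as working in this thread."""
    if version is None:
        version = major_minor()
    return version in REPORTED_WORKING
```

Calling `reported_working()` inside the activated venv checks the interpreter that pip will install packages for, which may differ from the system `python`.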

EnderSyth commented 7 months ago

I'm doing this under WSL on Ubuntu 20.04.6 LTS. I get Python 3.10.11 as the output running python -V.

When looking into the errors, I saw many posts about different versions of Python causing issues, but 3.10 was supposed to be fine from what I read.

EnderSyth commented 7 months ago

> No problem at all, happy to help. Are you running Python 3.12 by any chance? I tried to build this on Ubuntu 24.04 and ran into a similar issue, although setting up a venv (a virtual environment) with Python 3.11 fixed it. Hope this helps.

Well, I bit the bullet, deleted the venv, and decided to redo it with Python 3.11, but thanks to WSL I couldn't install Python 3.11. Long story short, Bing GPT was able to guide me through compiling my own Python 3.11, and somehow it works!

Now I just have to figure out how to change to libritts_r properly. I think I need the --voice_name option, so I'm going to play with that next.

vcalv commented 4 months ago

So I didn't see this and just opened a pull request specifically for piper.

See #77

On the one hand, it's less generic; on the other hand, it "maps" parameters like silence and speed directly into Piper parameters.

Any suggestions welcome.

Bryksin commented 3 months ago

This PR is out of date and seems to have been implemented before the major refactoring and restructuring of the project. I will close it soon if it is not updated.

p0n1 commented 2 months ago

I think this local_tts feature is quite flexible. I might adapt it to the latest code when I'm available.