DrewThomasson / ebook2audiobookXTTS

Generates an audiobook with chapters and ebook metadata using Calibre and XTTS from Coqui TTS, with optional voice cloning and support for multiple languages.
MIT License

error on basic command #17

Closed ROBERT-MCDOWELL closed 2 weeks ago

ROBERT-MCDOWELL commented 2 weeks ago

```
/home/src/AI/ebook2audiobookXTTS/env_py311/lib64/python3.11/site-packages/torch/cuda/__init__.py:654: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
```

Using model: xtts

```
/home/src/AI/ebook2audiobookXTTS/env_py311/lib64/python3.11/site-packages/TTS/utils/io.py:54: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  return torch.load(f, map_location=map_location, **kwargs)
GPT2InferenceModel has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions.
```

  • If you're using trust_remote_code=True, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes
  • If you are the owner of the model architecture code, please modify your model class such that it inherits from GenerationMixin (after PreTrainedModel, otherwise you'll get an exception).
  • If you are not the owner of the model architecture class, please contact the model code owner to update it.

```
Traceback (most recent call last):
  File "/home/src/AI/ebook2audiobookXTTS/ebook2audiobook.py", line 461, in <module>
    convert_chapters_to_audio(chapters_directory, output_audio_directory, target_voice, language)
  File "/home/src/AI/ebook2audiobookXTTS/ebook2audiobook.py", line 412, in convert_chapters_to_audio
    sentences = sent_tokenize(chapter_text, language='italian' if language == 'it' else 'english')
  File "/home/src/AI/ebook2audiobookXTTS/env_py311/lib64/python3.11/site-packages/nltk/tokenize/__init__.py", line 119, in sent_tokenize
    tokenizer = _get_punkt_tokenizer(language)
  File "/home/src/AI/ebook2audiobookXTTS/env_py311/lib64/python3.11/site-packages/nltk/tokenize/__init__.py", line 105, in _get_punkt_tokenizer
    return PunktTokenizer(language)
  File "/home/src/AI/ebook2audiobookXTTS/env_py311/lib64/python3.11/site-packages/nltk/tokenize/punkt.py", line 1744, in __init__
    self.load_lang(lang)
  File "/home/src/AI/ebook2audiobookXTTS/env_py311/lib64/python3.11/site-packages/nltk/tokenize/punkt.py", line 1749, in load_lang
    lang_dir = find(f"tokenizers/punkt_tab/{lang}/")
  File "/home/src/AI/ebook2audiobookXTTS/env_py311/lib64/python3.11/site-packages/nltk/data.py", line 579, in find
    raise LookupError(resource_not_found)
LookupError:
  Resource punkt_tab not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('punkt_tab')

  For more information see: https://www.nltk.org/data.html

  Attempted to load tokenizers/punkt_tab/english/

  Searched in:
```


any idea how to solve it?

DrewThomasson commented 2 weeks ago

## Replace

```python
nltk.download('punkt_tab')
```

## With

```python
nltk.download('punkt')
```
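
A defensive variant (just a sketch; `ensure_punkt` is a hypothetical helper name, and nltk is assumed to be importable) only downloads when the resource is actually missing, so repeated runs skip the network call:

```python
def ensure_punkt() -> None:
    """Download the punkt tokenizer data only if it is not already present."""
    import nltk  # imported lazily so the check is cheap when unused

    try:
        # Raises LookupError when the resource is not on any nltk data path
        nltk.data.find("tokenizers/punkt")
    except LookupError:
        nltk.download("punkt", quiet=True)
```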

DrewThomasson commented 2 weeks ago

I misread your error, it's actually:

Just run this command in your terminal lol

```shell
python -m nltk.downloader punkt
```

DrewThomasson commented 2 weeks ago

Or you could just run the docker

Then you won't have any issues lol

DrewThomasson commented 2 weeks ago

Did it fix it?

ROBERT-MCDOWELL commented 2 weeks ago

I'm running the docker indeed, so I cannot use the command above :(

ROBERT-MCDOWELL commented 2 weeks ago

but wait, how do I use the command line with the docker? I don't want to use the GUI as my server is headless. I think I must learn more about docker.

DrewThomasson commented 2 weeks ago

oh hm, actually good point,

I'll put all of the GitHub repo files into the docker image and come back to you with a modified docker launch command for headless

ROBERT-MCDOWELL commented 2 weeks ago

ok, I completely cleaned up Docker and reinstalled your image, but now I get:

```
starting...
[nltk_data] Error loading punkt_tab: <urlopen error [Errno -3]
[nltk_data]     Temporary failure in name resolution>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/urllib/request.py", line 1348, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/usr/local/lib/python3.10/http/client.py", line 1283, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.10/http/client.py", line 1329, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.10/http/client.py", line 1278, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.10/http/client.py", line 1038, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.10/http/client.py", line 976, in send
    self.connect()
  File "/usr/local/lib/python3.10/http/client.py", line 1448, in connect
    super().connect()
  File "/usr/local/lib/python3.10/http/client.py", line 942, in connect
    self.sock = self._create_connection(
  File "/usr/local/lib/python3.10/socket.py", line 836, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
  File "/usr/local/lib/python3.10/socket.py", line 967, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/app/app.py", line 30, in <module>
    import download_tos_agreed_file
  File "/home/user/app/download_tos_agreed_file.py", line 23, in <module>
    download_tos_agreed()
  File "/home/user/app/download_tos_agreed_file.py", line 17, in download_tos_agreed
    urllib.request.urlretrieve(file_url, file_path)
  File "/usr/local/lib/python3.10/urllib/request.py", line 241, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/usr/local/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/lib/python3.10/urllib/request.py", line 519, in open
    response = self._open(req, data)
  File "/usr/local/lib/python3.10/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/usr/local/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/usr/local/lib/python3.10/urllib/request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/usr/local/lib/python3.10/urllib/request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno -3] Temporary failure in name resolution>
```

DrewThomasson commented 2 weeks ago

Here is a rushed implementation. I haven't tested it yet, but I'm giving it to you before I head to my university so I can leave you with something that might work till I get back.

We will mount the ebook2audiobookXTTS/ folder from your local machine to the Docker container, so the app can access any input files (e.g., EPUB files) placed in that folder.

## Docker Run Command Breakdown

Here’s the full Docker command to achieve this:

```shell
docker run -it --rm -p 7860:7860 --platform=linux/amd64 --pull always \
    -v /path/to/local/ebook2audiobookXTTS:/app/ebook2audiobookXTTS \
    registry.hf.space/drewthomasson-ebook2audiobookxtts:latest \
    python /app/ebook2audiobookXTTS/ebook2audiobook.py ebook2audiobook.py <path_to_ebook_file> [path_to_voice_file] [language_code]
```

## Explanation of the Command

  • `-it`: Runs the container interactively.

  • `--rm`: Automatically removes the container after it exits.

  • `-p 7860:7860`: Exposes port 7860 so you can access the Gradio interface or any other services.

  • `--platform=linux/amd64`: Ensures compatibility with the amd64 architecture.

  • `--pull always`: Ensures the latest version of the Docker image is pulled each time.

  • `-v /path/to/local/ebook2audiobookXTTS:/app/ebook2audiobookXTTS`: Mounts the local ebook2audiobookXTTS/ folder to the container, allowing access to the app and input files.

  • `registry.hf.space/drewthomasson-ebook2audiobookxtts:latest`: Specifies the Docker image to use.

  • `python /app/ebook2audiobookXTTS/ebook2audiobook.py`: Runs the specified Python app within the container.

DrewThomasson commented 2 weeks ago

I will have a tested and working implementation of this before the end of today though.

DrewThomasson commented 2 weeks ago

> ok I cleaned up completely docker and restart to install your docker but now I get starting... [nltk_data] Error loading punkt_tab: <urlopen error [Errno -3] Temporary failure in name resolution> [...]

oh, and I didn't forget about this, hm

what was your full docker run command for it?

DrewThomasson commented 2 weeks ago

oh I see.... I'll get the docker image to be easily runnable in a public manner as well

ROBERT-MCDOWELL commented 2 weeks ago

the command was

```shell
docker run -it --rm -p 7860:7860 --platform=linux/amd64 athomasson2/ebook2audiobookxtts:huggingface python app.py
```

then tried the one above

```shell
docker run -it --rm -p 7860:7860 --platform=linux/amd64 --pull always -v /path/to/local/ebook2audiobookXTTS:/app/ebook2audiobookXTTS registry.hf.space/drewthomasson-ebook2audiobookxtts:latest python /app/ebook2audiobookXTTS/ebook2audiobook.py ebook2audiobook.py <path_to_ebook_file> [path_to_voice_file] [language_code]
```

which gives:

```
latest: Pulling from drewthomasson-ebook2audiobookxtts
eaa6071ac135: Pull complete
4887aa1af7bc: Pull complete
2fe776c703b1: Pull complete
e9ca11e1d57b: Pull complete
8b352e63b19a: Pull complete
d5f6fa00f923: Pull complete
ea8411a744a3: Pull complete
9e788e27f6ff: Pull complete
806a30b761f2: Pull complete
Digest: sha256:0eb4f626af8442ccd4086694522768597d0f571c1e3fbcfcbb570ea61715e720
Status: Downloaded newer image for registry.hf.space/drewthomasson-ebook2audiobookxtts:latest
starting...
[nltk_data] Error loading punkt: <urlopen error [Errno -3] Temporary
[nltk_data]     failure in name resolution>
Wiping and removeing Working_files folder...
Error removing ./Working_files: [Errno 2] No such file or directory: './Working_files'
Wiping and and removeing chapter_wav_files folder...
Error removing ./Chapter_wav_files: [Errno 2] No such file or directory: './Chapter_wav_files'
Created directory: ./Working_files/Book
The file ./Working_files/temp.epub does not exist.
Cannot read from /home/user/app/ebook2audiobook.py
An error occurred while converting the eBook: Command '['ebook-convert', 'ebook2audiobook.py', './Working_files/temp.epub']' returned non-zero exit status 1.
[nltk_data] Error loading punkt: <urlopen error [Errno -3] Temporary
[nltk_data]     failure in name resolution>
Traceback (most recent call last):
  File "/app/ebook2audiobookXTTS/ebook2audiobook.py", line 458, in <module>
    create_chapter_labeled_book(ebook_file_path)
  File "/app/ebook2audiobookXTTS/ebook2audiobook.py", line 259, in create_chapter_labeled_book
    process_chapter_files(folder_path, output_csv)
  File "/app/ebook2audiobookXTTS/ebook2audiobook.py", line 235, in process_chapter_files
    chapter_files = sorted(os.listdir(folder_path), key=lambda x: int(x.split('_')[1].split('.')[0]))
FileNotFoundError: [Errno 2] No such file or directory: './Working_files/temp_ebook'
```

DrewThomasson commented 2 weeks ago

I'll get back to you when I have a tested and working version on my computer that doesn't rely on the gradio interface

DrewThomasson commented 2 weeks ago

btw what are you running this headless env on anyway?

Ubuntu? Windows? Mac? Arch Linux? Fedora?

ROBERT-MCDOWELL commented 2 weeks ago

Linux Fedora 40

DrewThomasson commented 2 weeks ago

Still working on it lol

About to drop a new and improved version that does all of the other script's functionality in a docker, offline lol

DrewThomasson commented 2 weeks ago

ok so here's a preview while I work out the bugs lol, this seems to work just fine on my end

First do a docker pull of the latest with:

```shell
docker pull registry.hf.space/drewthomasson-ebook2audiobookxtts:latest
```

Then run it headless:

```shell
docker run -it --rm \
    -v $(pwd)/input-folder:/home/user/app/input_folder \
    -v $(pwd)/Audiobooks:/home/user/app/Audiobooks \
    --platform linux/amd64 \
    registry.hf.space/drewthomasson-ebook2audiobookxtts:latest \
    python app.py --headless True --ebook /home/user/app/input_folder/YOUR_INPUT_FILE.TXT
```

To get the help output for the other parameters this program has, you can run this:

```shell
docker run -it --rm \
    --platform linux/amd64 \
    registry.hf.space/drewthomasson-ebook2audiobookxtts:latest \
    python app.py -h
```

and that will output this:

```
user/app/ebook2audiobookXTTS/input-folder -v $(pwd)/Audiobooks:/home/user/app/ebook2audiobookXTTS/Audiobooks --memory="4g" --network none --platform linux/amd64 registry.hf.space/drewthomasson-ebook2audiobookxtts:latest python app.py -h
starting...
usage: app.py [-h] [--share SHARE] [--headless HEADLESS] [--ebook EBOOK] [--voice VOICE]
              [--language LANGUAGE] [--use_custom_model USE_CUSTOM_MODEL]
              [--custom_model CUSTOM_MODEL] [--custom_config CUSTOM_CONFIG]
              [--custom_vocab CUSTOM_VOCAB] [--custom_model_url CUSTOM_MODEL_URL]

Convert eBooks to Audiobooks using a Text-to-Speech model. You can either launch the
Gradio interface or run the script in headless mode for direct conversion.

options:
  -h, --help            show this help message and exit
  --share SHARE         Set to True to enable a public shareable Gradio link. Defaults
                        to False.
  --headless HEADLESS   Set to True to run in headless mode without the Gradio
                        interface. Defaults to False.
  --ebook EBOOK         Path to the ebook file for conversion. Required in headless
                        mode.
  --voice VOICE         Path to the target voice file for TTS. Optional, uses a default
                        voice if not provided.
  --language LANGUAGE   Language for the audiobook conversion. Options: en, es, fr, de,
                        it, pt, pl, tr, ru, nl, cs, ar, zh-cn, ja, hu, ko. Defaults to
                        English (en).
  --use_custom_model USE_CUSTOM_MODEL
                        Set to True to use a custom TTS model. Defaults to False. Must
                        be True to use custom models, otherwise you'll get an error.
  --custom_model CUSTOM_MODEL
                        Path to the custom model file (.pth). Required if using a custom
                        model.
  --custom_config CUSTOM_CONFIG
                        Path to the custom config file (config.json). Required if using
                        a custom model.
  --custom_vocab CUSTOM_VOCAB
                        Path to the custom vocab file (vocab.json). Required if using a
                        custom model.
  --custom_model_url CUSTOM_MODEL_URL
                        URL to download the custom model as a zip file. Optional, but
                        will be used if provided. Examples include David Attenborough's
                        model: 'https://huggingface.co/drewThomasson/xtts_David_Attenbor
                        ough_fine_tune/resolve/main/Finished_model_files.zip?download=tr
                        ue'. More XTTS fine-tunes can be found on my Hugging Face at
                        'https://huggingface.co/drewThomasson'.

Example: python script.py --headless --ebook path_to_ebook --voice path_to_voice
--language en --use_custom_model True --custom_model model.pth --custom_config
config.json --custom_vocab vocab.json
```

DrewThomasson commented 2 weeks ago

bon appetit

ROBERT-MCDOWELL commented 2 weeks ago

ok I will give it a try as soon as I have time ;) btw why `$(pwd)`?

DrewThomasson commented 2 weeks ago

It just makes it easier on you lol

Then it creates the folder in the working directory for you instead of you having to type out the full path

You can use the full path if you want to, though
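
For illustration, `$(pwd)` is ordinary shell command substitution, so the two mount flags below resolve to the same absolute host path (the `/home/robert` path is just an example):

```shell
# $(pwd) expands to the absolute path of the current working directory.
# When run from /home/robert, these two -v flags are therefore identical:
#   -v $(pwd)/input-folder:/home/user/app/input_folder
#   -v /home/robert/input-folder:/home/user/app/input_folder
echo "$(pwd)/input-folder"
```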

ROBERT-MCDOWELL commented 2 weeks ago

oops, I mixed pwd with passwd :-\

ROBERT-MCDOWELL commented 2 weeks ago

Seems to work :D! It needs a little time as my server is on CPU; I will give you the final report. I will certainly try to install it in a virtual env too. However, I have some suggestions:

keep it going, your project is becoming pretty nice!

DrewThomasson commented 2 weeks ago

WOOT! πŸŽ‰πŸŽ‰

ALSO THX!

Those are good suggestions ngl lol

I'll look at adding them

I'm also looking at doing another update that'll potentially allow you to change the generation speed and such of Xtts,

Also the program should automatically grab onto the nearest CUDA capable device it sees, and if not it'll just use CPU

ROBERT-MCDOWELL commented 2 weeks ago

yup, that's a good option too indeed. I also noticed during the log process that the total remaining segments is not visible, so we cannot get a time estimate or the number of segments left to convert... Forget NPU, it's too new, and btw I really don't like what they (the device makers) want to do with it... a kind of super spy. Also, did you think about crashes or unwanted ends of the process, so there's a way to resume what's already converted?

DrewThomasson commented 2 weeks ago

I don't have a resume or pause feature at the moment sadly lol

That's also up there on the list lol

The Best I can do right now is point to how to pause or resume a docker image lol

https://github.com/DrewThomasson/VoxNovel/issues/21#issuecomment-2366910480

ROBERT-MCDOWELL commented 2 weeks ago

is each segment = one audio file? or does it go into one global file? or stay in RAM until it's finished?

DrewThomasson commented 2 weeks ago

yeah, I made sure to not rely on ram lol,

Each segment in the code is processed separately into individual audio files (fragments)

Here's a short summary of how the process works:

  1. Chapter Processing:

    • The ebook is split into chapters and then further split into sentences or fragments based on language and length constraints.
  2. Fragment Generation:

    • Each fragment is generated into its own temporary .wav file in a "temp" directory. This avoids loading everything into RAM at once, which helps with memory management.
  3. Combining Audio:

    • After all fragments of a chapter are generated, they are combined into a single chapter audio file (e.g., audio_chapter_1.wav).
  4. Final M4B Audiobook:

    • After all chapter audio files are created, they are merged into one final M4B audiobook file using ffmpeg.

So, each fragment gets saved as an individual audio file, combined into chapter audio files, and ultimately merged into a single audiobook file at the end.
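
The chapter-combining step (3) can be sketched with just Python's standard-library `wave` module; `combine_fragments` and the file layout are illustrative, not the script's actual code, and the final M4B merge via ffmpeg is not reproduced here:

```python
import wave
from pathlib import Path

def combine_fragments(fragment_paths: list[Path], chapter_path: Path) -> None:
    """Concatenate .wav fragments that share the same sample format
    into a single chapter file, streaming one fragment at a time so
    nothing large ever sits in RAM."""
    with wave.open(str(chapter_path), "wb") as out:
        for i, frag in enumerate(fragment_paths):
            with wave.open(str(frag), "rb") as f:
                if i == 0:
                    # Copy channels/width/rate from the first fragment
                    out.setparams(f.getparams())
                out.writeframes(f.readframes(f.getnframes()))
```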

ROBERT-MCDOWELL commented 2 weeks ago

good, so on resume it should be easy to check which segments are already done and jump to the next one...
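
That check could be as simple as the following sketch (hypothetical; `fragment_{i}.wav` is an assumed naming scheme, not necessarily what the script writes):

```python
from pathlib import Path

def needs_synthesis(fragment_index: int, temp_dir: Path) -> bool:
    """Return True when this fragment's audio has not been generated yet,
    so a resumed run can skip fragments whose .wav file already exists."""
    return not (temp_dir / f"fragment_{fragment_index}.wav").exists()
```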

DrewThomasson commented 1 week ago

lol yup

ROBERT-MCDOWELL commented 1 week ago

`--device ["gpu","cpu"]` added; if the param does not exist, or if the param is "gpu" but no GPU is available, it falls back to "cpu". I also made the `--headless` and `--custom*` params behave more tolerantly. I will give a full explanation of the changes once the PR is done. btw, if you think the PR will be too shocking for you (I mean, all the code refactored), maybe I can create a fresh new repo and work in parallel with your repo without disturbing your habits...
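
The fallback described here might look like this (illustrative only, not the actual PR code; `pick_device` is a hypothetical helper, and PyTorch is only needed when a GPU is requested):

```python
def pick_device(requested: str = "gpu") -> str:
    """Map a --device value to a usable torch device string.
    Falls back to "cpu" when no CUDA device (or no torch) is available."""
    if requested == "gpu":
        try:
            import torch  # imported lazily; only needed for the GPU check
            if torch.cuda.is_available():
                return "cuda"
        except ImportError:
            pass  # torch not installed in this environment
    return "cpu"
```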

DrewThomasson commented 1 week ago

Nice