lukaszliniewicz / Pandrator

Pandrator aspires to be a user-friendly app with a graphical interface and a one-click installer that creates high-quality speech from text in multiple languages (audiobooks, speech synchronised with subtitles and more) using local models (XTTS, Silero or VoiceCraft), plus voice cloning, LLM pre-processing, RVC enhancement, and automatic evaluation
GNU Affero General Public License v3.0

Crashing on 'Start Generation' #25

Open Troyificus opened 2 months ago

Troyificus commented 2 months ago

Hi, I've tried this with both an XTTS and a Silero API server and the issue occurs with both of them, so I'm guessing it's something else I'm doing wrong. I'm following the steps: creating a session, uploading a source, choosing a TTS service and clicking Start Generation, at which point the GUI goes 'Not Responding' and has to be terminated. Here's the output from the terminals:

Pandrator:

C:\Users\Trism\Pandrator\Pandrator>python pandrator.py
pygame 2.5.2 (SDL 2.28.3, Python 3.12.4)
Hello from the pygame community. https://www.pygame.org/contribute.html
[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\Trism\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
on_language_selected method called
2024-06-23 15:41:11,894 - DEBUG - Starting new HTTP connection (1): localhost:8001
2024-06-23 15:41:14,712 - DEBUG - http://localhost:8001 "POST /tts/language HTTP/1.1" 200 0
2024-06-23 15:41:14,713 - DEBUG - Starting new HTTP connection (1): localhost:8001
2024-06-23 15:41:16,752 - DEBUG - http://localhost:8001 "GET /tts/speakers HTTP/1.1" 200 10976
2024-06-23 15:41:16,756 - DEBUG - Starting new HTTP connection (1): localhost:8001
2024-06-23 15:41:19,563 - DEBUG - http://localhost:8001 "POST /tts/language HTTP/1.1" 200 0
2024-06-23 15:41:20,572 - DEBUG - Starting new HTTP connection (1): localhost:8001
2024-06-23 15:41:22,587 - DEBUG - http://localhost:8001 "GET /tts/speakers HTTP/1.1" 200 10976
DeDRM v7.2.1: Trying to decrypt V.epub
DeDRM v7.2.1: Verifying zip archive integrity
DeDRM v7.2.1: “V.epub” is neither an Adobe Adept nor a Barnes & Noble encrypted ePub
Running file type plugin DeDRM failed with traceback:
Traceback (most recent call last):
  File "calibre\customize\ui.py", line 187, in _run_filetype_plugins
  File "calibre_plugins.dedrm.__init__", line 644, in run
  File "calibre_plugins.dedrm.__init__", line 420, in ePubDecrypt
calibre_plugins.dedrm.DeDRMError: DeDRM v7.2.1: Couldn't decrypt after 0.6 seconds. DRM free perhaps?
DeDRM v7.2.1: Trying to decrypt V.epub
DeDRM v7.2.1: Verifying zip archive integrity
DeDRM v7.2.1: “V.epub” is neither an Adobe Adept nor a Barnes & Noble encrypted ePub
Running file type plugin DeDRM failed with traceback:
Traceback (most recent call last):
  File "calibre\customize\ui.py", line 187, in _run_filetype_plugins
  File "calibre_plugins.dedrm.__init__", line 644, in run
  File "calibre_plugins.dedrm.__init__", line 420, in ePubDecrypt
calibre_plugins.dedrm.DeDRMError: DeDRM v7.2.1: Couldn't decrypt after 0.2 seconds. DRM free perhaps?
1% Converting input to HTML...
InputFormatPlugin: EPUB Input running
on C:\Users\Trism\Downloads\V.epub
Found HTML cover OEBPS/1-cover.xhtml
Parsing all content...
34% Running transforms on e-book...
Merging user specified metadata...
Detecting structure...

        [**All detected chapters**]

Flattening CSS and remapping font sizes...
Source base font size is 12.00000pt
Removing fake margins...
Cleaning up manifest...
Trimming unused files from manifest...
Trimming 'OEBPS/v2/logo2.jpg' from manifest
Trimming 'OEBPS/v1/logo2.jpg' from manifest
Trimming 'OEBPS/v3/logo.jpg' from manifest
Trimming 'OEBPS/v2/logo.jpg' from manifest
Trimming 'OEBPS/v3/ad.jpg' from manifest
Creating TXT Output...
67% Running TXT Output plugin
Converting XHTML to TXT...
TXT output written to C:\Users\Trism\Pandrator\Pandrator\Outputs\V\V.txt
Output saved to   C:\Users\Trism\Pandrator\Pandrator\Outputs\V\V.txt
2024-06-23 15:43:55,333 - DEBUG - Starting new HTTP connection (1): localhost:8001
2024-06-23 15:43:57,368 - DEBUG - http://localhost:8001 "GET /docs HTTP/1.1" 200 939

Silero server:

C:\Users\Trism\Pandrator\Silero TTS>python -m silero_api_server
C:\Users\Trism\scoop\apps\python\current\Lib\site-packages\silero_api_server\tts.py:26: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
  torchaudio.set_audio_backend("soundfile")
2024-06-23 15:34:21.277 | INFO     | silero_api_server.tts:__init__:33 - TTS Service loaded successfully
2024-06-23 15:34:21.278 | INFO     | silero_api_server.tts:list_languages:149 - Loading remote language index
2024-06-23 15:34:28.790 | WARNING  | silero_api_server.tts:load_model:58 - Downloading Silero v3_en.pt model...
100%|█████████████████████████████████████████████████████████████████████████████| 54.5M/54.5M [00:19<00:00, 2.90MB/s]
2024-06-23 15:34:48.713 | INFO     | silero_api_server.tts:load_model:61 - Model download completed.
2024-06-23 15:34:49.409 | INFO     | silero_api_server.server:<module>:26 - Samples empty, generating new samples.
2024-06-23 15:34:49.410 | WARNING  | silero_api_server.tts:generate_samples:123 - Removing current samples
2024-06-23 15:34:49.410 | INFO     | silero_api_server.tts:generate_samples:127 - Creating new samples. This should take a minute...
Generated new voice
2024-06-23 15:37:07.622 | INFO     | silero_api_server.tts:generate_samples:134 - New samples created
INFO:     Started server process [22564]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8001 (Press CTRL+C to quit)
INFO:     127.0.0.1:49459 - "POST /tts/language HTTP/1.1" 200 OK
INFO:     127.0.0.1:49463 - "GET /tts/speakers HTTP/1.1" 200 OK
INFO:     127.0.0.1:49469 - "POST /tts/language HTTP/1.1" 200 OK
INFO:     127.0.0.1:49478 - "GET /tts/speakers HTTP/1.1" 200 OK
INFO:     127.0.0.1:49738 - "GET /docs HTTP/1.1" 200 OK

Siddharth-Latthe-07 commented 2 months ago

@Troyificus The issue seems to be that the GUI freezes or becomes unresponsive during the text-to-speech (TTS) generation process. Try these steps and let me know whether they resolve it:

  1. Check your system's available RAM and storage in Task Manager.
  2. Increase the timeout and enable logging: the TTS generation process may be taking longer than expected. A longer timeout and more detailed logging can help identify whether the request is timing out or where it is getting stuck. Something like this:

    import logging
    import requests

    # Log every HTTP request and response at debug level
    logging.basicConfig(level=logging.DEBUG)

    # Example of increasing the timeout (in seconds)
    response = requests.post("http://localhost:8001/tts/language", timeout=120)

  3. Try running the TTS process separately from the GUI so you can analyse the results and any errors in more detail. Also check that your libraries are updated to the latest versions.
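
As a rough sketch of step 3, the script below queries the TTS server directly, without the GUI, and reports how long each request takes. The address and the `/docs` and `/tts/speakers` endpoints are taken from the logs above; the `probe` helper is a hypothetical name used only for illustration, and it uses the standard library's `urllib` instead of `requests` so it runs without extra packages. It is not Pandrator's own code.

```python
import time
import urllib.request

BASE_URL = "http://localhost:8001"  # server address as seen in the logs above

# Talk to the local server directly and ignore any system-wide proxy settings.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(path, timeout=120.0, base_url=BASE_URL):
    """Send a GET request and report whether the server answered in time.

    Returns an (ok, detail) tuple instead of raising, so a hung or
    unreachable server shows up as a readable message.
    (Hypothetical helper for illustration only.)
    """
    url = base_url + path
    started = time.monotonic()
    try:
        with _opener.open(url, timeout=timeout) as response:
            body = response.read()
            elapsed = time.monotonic() - started
            return True, f"{url} -> HTTP {response.status}, {len(body)} bytes in {elapsed:.1f}s"
    except OSError as exc:
        # URLError, HTTPError, timeouts and refused connections all derive from OSError.
        elapsed = time.monotonic() - started
        return False, f"{url} -> failed after {elapsed:.1f}s: {exc}"


if __name__ == "__main__":
    for path in ("/docs", "/tts/speakers"):
        ok, detail = probe(path)
        print("OK  " if ok else "FAIL", detail)
```

If both requests come back quickly here while the GUI still hangs, the server is probably fine and the problem is more likely on the Pandrator side.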

Please let me know whether the above works.
Thanks

lukaszliniewicz commented 2 weeks ago

What is the format of the source file and how big is it? Preprocessing may take a long time for very large PDF files, for example.