voice-cloning-app / Voice-Cloning-App

A Python/Pytorch app for easily synthesising human voices
BSD 3-Clause "New" or "Revised" License

Error: The expanded size of the tensor must match the existing size at non-singleton dimension 0. #17

Closed AlessandroSpallina closed 3 years ago

AlessandroSpallina commented 3 years ago

Hi all, using the latest version (v0.5.2) of the Windows executable, I get the error shown in the image when I try to create a new dataset.

[Image: Cattura]

My logs:

[14596] WARNING: file already exists but should not: C:\Users\SK3LA\AppData\Local\Temp\_MEI145962\torch\_C.cp38-win_amd64.pyd
Server initialized for threading.
Server initialized for threading.
pydub\utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
torchaudio\extension\extension.py:14: UserWarning: torchaudio C++ extension is not available.
torchaudio\backend\utils.py:63: UserWarning: The interface of "soundfile" backend is planned to change in 0.8.0 to match that of "sox_io" backend and the current interface will be removed in 0.9.0. To use the new interface, do `torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE = False` before setting the backend to "soundfile". Please refer to https://github.com/pytorch/audio/issues/903 for the detail.
INFO:matplotlib.font_manager:Generating new fontManager, this may take some time...
[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\SK3LA\AppData\Local\Temp\_MEI145962\nltk_data
[nltk_data]     ...
[nltk_data]   Package wordnet is already up-to-date!
WARNING:werkzeug:WebSocket transport not available. Install eventlet or gevent and gevent-websocket for improved performance.
 * Serving Flask app "main" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
INFO:werkzeug: * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
INFO:werkzeug:127.0.0.1 - - [08/Apr/2021 12:31:21] "GET / HTTP/1.1" 200 -
Starting Thread
INFO:werkzeug:127.0.0.1 - - [08/Apr/2021 12:31:49] "POST / HTTP/1.1" 200 -
Xv0S66KHXNhGzf-3AAAA: Sending packet OPEN data {'sid': 'Xv0S66KHXNhGzf-3AAAA', 'upgrades': [], 'pingTimeout': 5000, 'pingInterval': 25000}
INFO:engineio.server:Xv0S66KHXNhGzf-3AAAA: Sending packet OPEN data {'sid': 'Xv0S66KHXNhGzf-3AAAA', 'upgrades': [], 'pingTimeout': 5000, 'pingInterval': 25000}
INFO:werkzeug:127.0.0.1 - - [08/Apr/2021 12:31:50] "GET /socket.io/?EIO=4&transport=polling&t=NYn2S3N HTTP/1.1" 200 -
Xv0S66KHXNhGzf-3AAAA: Received packet MESSAGE data 0/voice,
INFO:engineio.server:Xv0S66KHXNhGzf-3AAAA: Received packet MESSAGE data 0/voice,
Xv0S66KHXNhGzf-3AAAA: Sending packet MESSAGE data 0/voice,{"sid":"NWqGzbr7-JMRmUfgAAAB"}
INFO:engineio.server:Xv0S66KHXNhGzf-3AAAA: Sending packet MESSAGE data 0/voice,{"sid":"NWqGzbr7-JMRmUfgAAAB"}
INFO:werkzeug:127.0.0.1 - - [08/Apr/2021 12:31:50] "POST /socket.io/?EIO=4&transport=polling&t=NYn2S8I&sid=Xv0S66KHXNhGzf-3AAAA HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [08/Apr/2021 12:31:50] "GET /socket.io/?EIO=4&transport=polling&t=NYn2S8I.0&sid=Xv0S66KHXNhGzf-3AAAA HTTP/1.1" 200 -
emitting event "logs" to all [/voice]
INFO:socketio.server:emitting event "logs" to all [/voice]
Xv0S66KHXNhGzf-3AAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Coverting data\\datasets\\Fedez\\audio.mp3..."}]
INFO:werkzeug:127.0.0.1 - - [08/Apr/2021 12:31:54] "GET /socket.io/?EIO=4&transport=polling&t=NYn2SDC&sid=Xv0S66KHXNhGzf-3AAAA HTTP/1.1" 200 -
INFO:engineio.server:Xv0S66KHXNhGzf-3AAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Coverting data\\datasets\\Fedez\\audio.mp3..."}]
INFO:voice:Coverting data\datasets\Fedez\audio.mp3...
emitting event "logs" to all [/voice]
INFO:socketio.server:emitting event "logs" to all [/voice]
Xv0S66KHXNhGzf-3AAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Loading script from data\\datasets\\Fedez\\text.txt..."}]
INFO:werkzeug:127.0.0.1 - - [08/Apr/2021 12:31:55] "GET /socket.io/?EIO=4&transport=polling&t=NYn2TFh&sid=Xv0S66KHXNhGzf-3AAAA HTTP/1.1" 200 -
INFO:engineio.server:Xv0S66KHXNhGzf-3AAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Loading script from data\\datasets\\Fedez\\text.txt..."}]
INFO:voice:Loading script from data\datasets\Fedez\text.txt...
emitting event "logs" to all [/voice]
INFO:socketio.server:emitting event "logs" to all [/voice]
Xv0S66KHXNhGzf-3AAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Searching text for matching fragments..."}]
INFO:engineio.server:Xv0S66KHXNhGzf-3AAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Searching text for matching fragments..."}]
INFO:voice:Searching text for matching fragments...
emitting event "logs" to all [/voice]
INFO:werkzeug:127.0.0.1 - - [08/Apr/2021 12:31:55] "GET /socket.io/?EIO=4&transport=polling&t=NYn2TRM&sid=Xv0S66KHXNhGzf-3AAAA HTTP/1.1" 200 -
INFO:socketio.server:emitting event "logs" to all [/voice]
Xv0S66KHXNhGzf-3AAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Changing sample rate..."}]
INFO:engineio.server:Xv0S66KHXNhGzf-3AAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Changing sample rate..."}]
INFO:voice:Changing sample rate...
emitting event "logs" to all [/voice]
INFO:socketio.server:emitting event "logs" to all [/voice]
Xv0S66KHXNhGzf-3AAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Fetching segments..."}]
INFO:engineio.server:Xv0S66KHXNhGzf-3AAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Fetching segments..."}]
INFO:voice:Fetching segments...
INFO:werkzeug:127.0.0.1 - - [08/Apr/2021 12:31:55] "GET /socket.io/?EIO=4&transport=polling&t=NYn2TRY&sid=Xv0S66KHXNhGzf-3AAAA HTTP/1.1" 200 -
emitting event "logs" to all [/voice]
INFO:socketio.server:emitting event "logs" to all [/voice]
Xv0S66KHXNhGzf-3AAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Matching segments..."}]
INFO:engineio.server:Xv0S66KHXNhGzf-3AAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Matching segments..."}]
INFO:voice:Matching segments...
emitting event "logs" to all [/voice]
INFO:socketio.server:emitting event "logs" to all [/voice]
Xv0S66KHXNhGzf-3AAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Generating segments..."}]
INFO:engineio.server:Xv0S66KHXNhGzf-3AAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Generating segments..."}]
INFO:voice:Generating segments...
Using cache found in C:\Users\SK3LA/.cache\torch\hub\snakers4_silero-models_master
INFO:werkzeug:127.0.0.1 - - [08/Apr/2021 12:31:55] "GET /socket.io/?EIO=4&transport=polling&t=NYn2TW_&sid=Xv0S66KHXNhGzf-3AAAA HTTP/1.1" 200 -
emitting event "error" to all [/voice]
INFO:socketio.server:emitting event "error" to all [/voice]
Xv0S66KHXNhGzf-3AAAA: Sending packet MESSAGE data 2/voice,["error",{"type":"RuntimeError","text":"The expanded size of the tensor (12800) must match the existing size (0) at non-singleton dimension 0.  Target sizes: [12800].  Tensor sizes: [0]","stacktrace":"Traceback (most recent call last):\n  File \"application\\utils.py\", line 95, in background_task\n  File \"application\\utils.py\", line 42, in create_dataset\n  File \"dataset\\clip_generator.py\", line 53, in clip_generator\n  File \"dataset\\forced_alignment\\align.py\", line 66, in process_segments\n  File \"dataset\\transcribe.py\", line 32, in transcribe\n  File \"dataset\\transcribe.py\", line 18, in load_audio\nRuntimeError: The expanded size of the tensor (12800) must match the existing size (0) at non-singleton dimension 0.  Target sizes: [12800].  Tensor sizes: [0]\n"}]
INFO:werkzeug:127.0.0.1 - - [08/Apr/2021 12:32:00] "GET /socket.io/?EIO=4&transport=polling&t=NYn2TbB&sid=Xv0S66KHXNhGzf-3AAAA HTTP/1.1" 200 -
INFO:engineio.server:Xv0S66KHXNhGzf-3AAAA: Sending packet MESSAGE data 2/voice,["error",{"type":"RuntimeError","text":"The expanded size of the tensor (12800) must match the existing size (0) at non-singleton dimension 0.  Target sizes: [12800].  Tensor sizes: [0]","stacktrace":"Traceback (most recent call last):\n  File \"application\\utils.py\", line 95, in background_task\n  File \"application\\utils.py\", line 42, in create_dataset\n  File \"dataset\\clip_generator.py\", line 53, in clip_generator\n  File \"dataset\\forced_alignment\\align.py\", line 66, in process_segments\n  File \"dataset\\transcribe.py\", line 32, in transcribe\n  File \"dataset\\transcribe.py\", line 18, in load_audio\nRuntimeError: The expanded size of the tensor (12800) must match the existing size (0) at non-singleton dimension 0.  Target sizes: [12800].  Tensor sizes: [0]\n"}]
The expanded size of the tensor (12800) must match the existing size (0) at non-singleton dimension 0.  Target sizes: [12800].  Tensor sizes: [0]
Xv0S66KHXNhGzf-3AAAA: Sending packet PING data None
INFO:engineio.server:Xv0S66KHXNhGzf-3AAAA: Sending packet PING data None
Xv0S66KHXNhGzf-3AAAA: Client is gone, closing socket
INFO:engineio.server:Xv0S66KHXNhGzf-3AAAA: Client is gone, closing socket
Xv0S66KHXNhGzf-3AAAA: Client is gone, closing socket
INFO:engineio.server:Xv0S66KHXNhGzf-3AAAA: Client is gone, closing socket
Invalid session Xv0S66KHXNhGzf-3AAAA (further occurrences of this error will be logged with level INFO)
ERROR:engineio.server:Invalid session Xv0S66KHXNhGzf-3AAAA (further occurrences of this error will be logged with level INFO)
INFO:werkzeug:127.0.0.1 - - [08/Apr/2021 12:32:30] "POST /socket.io/?EIO=4&transport=polling&t=NYn2byF&sid=Xv0S66KHXNhGzf-3AAAA HTTP/1.1" 400 -
INFO:werkzeug:127.0.0.1 - - [08/Apr/2021 12:32:30] "GET /socket.io/?EIO=4&transport=polling&t=NYn2bks&sid=Xv0S66KHXNhGzf-3AAAA HTTP/1.1" 500 -
ERROR:werkzeug:Error on request:
Traceback (most recent call last):
  File "werkzeug\serving.py", line 323, in run_wsgi
  File "werkzeug\serving.py", line 312, in execute
  File "flask\app.py", line 2464, in __call__
  File "flask_socketio\__init__.py", line 45, in __call__
  File "engineio\middleware.py", line 60, in __call__
  File "socketio\server.py", line 571, in handle_request
  File "engineio\server.py", line 390, in handle_request
  File "engineio\server.py", line 612, in _get_socket
KeyError: 'Session is disconnected'
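
The RuntimeError in the log reports an existing size of 0 against a target size of 12800, which suggests that a clip reaching load_audio held no samples. Below is a minimal sketch of a guard that rejects empty clips before transcription. It assumes the clips are WAV files on disk; the helper names and the min_frames threshold are hypothetical and are not part of the app.

```python
import wave


def count_wav_frames(path):
    """Return the number of audio frames stored in a WAV file."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes()


def is_usable_clip(path, min_frames=1):
    """Return True if the clip holds at least min_frames frames.

    min_frames is an illustrative threshold, not a value taken from the app.
    """
    return count_wav_frames(path) >= min_frames
```

Calling is_usable_clip on each generated segment and skipping those that return False would turn a zero-length clip into a skipped segment instead of a crash. Whether that is the right behaviour for this app is a separate question.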

I'm trying to create an ITALIAN dataset from 5 minutes of audio. The text and the audio are below.

Text: [Italian] Fedez Lite Con Morgan a X Factor [DownSub.com].txt. Audio: the MP3 of this video: https://www.youtube.com/watch?v=HQ_HAZS7DkE
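
The input here is an MP3, and the log above contains a pydub warning that it could not find ffmpeg or avconv. Whether that is connected to the error is not established in this thread. As a quick check, this small sketch reports whether either tool is visible on PATH. The function names are hypothetical, and it only inspects PATH, so it will not detect a converter bundled elsewhere.

```python
import shutil


def find_audio_converters(names=("ffmpeg", "avconv")):
    """Map each converter name to its full path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


def has_audio_converter(names=("ffmpeg", "avconv")):
    """Return True if at least one of the named converters is on PATH."""
    return any(path is not None for path in find_audio_converters(names).values())
```

If has_audio_converter() returns False, installing ffmpeg and adding it to PATH would at least remove the pydub warning from the log.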

AlessandroSpallina commented 3 years ago

After that error message, the app freezes.

BenAAndrew commented 3 years ago

Hi @AlessandroSpallina, Italian is not currently supported, but it may be in the future. The bug you hit seems unrelated to that, however, and I will investigate it in https://github.com/BenAAndrew/Voice-Cloning-App/issues/11

BenAAndrew commented 3 years ago

Closing this, as I will post updates in https://github.com/BenAAndrew/Voice-Cloning-App/issues/11