Closed: tom2698 closed this 1 month ago
Never mind. I'm a silly goose and didn't install it properly.
@tom2698 what didn't you install properly? I'm running into the same issue.
I forget exactly. I vaguely remember missing a step in the instructions though. So try running through the instructions again and make sure you have the correct versions of everything.
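In case it helps anyone checking versions, here is a minimal sketch that prints what is installed in the active Python environment. The package list (`torch`, `torchaudio`) is my assumption based on the traceback in this issue, not an official AllTalk requirements list, and `installed_versions` is just an illustrative helper name. Run it with the same Python interpreter that AllTalk uses.

```python
from importlib import metadata


def installed_versions(packages):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed in the active environment map to None.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed package list for illustration; compare the output against
    # whatever versions the AllTalk instructions actually ask for.
    for name, version in installed_versions(["torch", "torchaudio"]).items():
        print(f"{name}: {version or 'not installed'}")
```

This only reports Python packages. The system ffmpeg version has to be checked separately (for example with your package manager).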
🔴 If you have installed AllTalk in a custom Python environment, I will only be able to provide limited assistance/support. AllTalk draws on a variety of scripts and libraries that are not written or managed by me, and they may fail, error or give strange results in custom-built Python environments.
🔴 Please generate a diagnostics report and upload the "diagnostics.log" as this helps me understand your configuration.
https://github.com/erew123/alltalk_tts/tree/main?#-how-to-make-a-diagnostics-report-file
Describe the bug
After opening finetune.py and clicking "Create dataset", it returns this error:
RuntimeError: Failed to open the input ".../alltalk_tts/finetune/tmp-trn/temp/custom_tempfile_1717569516_514.wav" (Invalid data found when processing input).
I tried downgrading ffmpeg from 6.1.1-5.fc9 to 6.0-16.fc39, but I get the same issue. Running on Fedora Linux.
To Reproduce Steps to reproduce the behaviour:
Screenshots If applicable, add screenshots to help explain your problem.
Text/logs
The data processing was interrupted due an error !! Please check the console to verify the full error message!
Error summary:
Traceback (most recent call last):
  File "/models2/VoiceModels/alltalk/alltalk_tts/finetune.py", line 1395, in preprocess_dataset
    train_meta, eval_meta, audio_total_size = format_audio_list(target_language=language, whisper_model=whisper_model, out_path=out_path, eval_split_number=eval_split_number, speaker_name_input=speaker_name_input, gradio_progress=progress)
  File "/models2/VoiceModels/alltalk/alltalk_tts/finetune.py", line 385, in format_audio_list
    wav, sr = torchaudio.load(temp_audio_path, format="wav")
  File "/models2/VoiceModels/alltalk/alltalk_tts/venv/lib64/python3.10/site-packages/torchaudio/_backend/utils.py", line 205, in load
    return backend.load(uri, frame_offset, num_frames, normalize, channels_first, format, buffer_size)
  File "/models2/VoiceModels/alltalk/alltalk_tts/venv/lib64/python3.10/site-packages/torchaudio/_backend/ffmpeg.py", line 297, in load
    return load_audio(uri, frame_offset, num_frames, normalize, channels_first, format)
  File "/models2/VoiceModels/alltalk/alltalk_tts/venv/lib64/python3.10/site-packages/torchaudio/_backend/ffmpeg.py", line 88, in load_audio
    s = torchaudio.io.StreamReader(src, format, None, buffer_size)
  File "/models2/VoiceModels/alltalk/alltalk_tts/venv/lib64/python3.10/site-packages/torio/io/_streaming_media_decoder.py", line 526, in __init__
    self._be = ffmpeg_ext.StreamingMediaDecoder(os.path.normpath(src), format, option)
RuntimeError: Failed to open the input "/models2/VoiceModels/alltalk/alltalk_tts/finetune/tmp-trn/temp/custom_tempfile_1717569516_514.wav" (Invalid data found when processing input).
Exception raised from get_input_format_context at /__w/audio/audio/pytorch/audio/src/libtorio/ffmpeg/stream_reader/stream_reader.cpp:42 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7f6a040cf897 in /models2/VoiceModels/alltalk/alltalk_tts/venv/lib64/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const, char const, unsigned int, std::string const&) + 0x64 (0x7f6a0407fb25 in /models2/VoiceModels/alltalk/alltalk_tts/venv/lib64/python3.10/site-packages/torch/lib/libc10.so)
frame #2: + 0x42334 (0x7f69ffecb334 in /models2/VoiceModels/alltalk/alltalk_tts/venv/lib/python3.10/site-packages/torio/lib/libtorio_ffmpeg6.so)
frame #3: torio::io::StreamingMediaDecoder::StreamingMediaDecoder(std::string const&, std::optional const&, std::optional<std::map<std::string, std::string, std::less, std::allocator<std::pair<std::string const, std::string> > > > const&) + 0x14 (0x7f69ffecdd34 in /models2/VoiceModels/alltalk/alltalk_tts/venv/lib/python3.10/site-packages/torio/lib/libtorio_ffmpeg6.so)
frame #4: + 0x3aa4e (0x7f694491aa4e in /models2/VoiceModels/alltalk/alltalk_tts/venv/lib64/python3.10/site-packages/torio/lib/_torio_ffmpeg6.so)
frame #5: + 0x32617 (0x7f6944912617 in /models2/VoiceModels/alltalk/alltalk_tts/venv/lib64/python3.10/site-packages/torio/lib/_torio_ffmpeg6.so)
frame #11: + 0xf6cb (0x7f6a061156cb in /models2/VoiceModels/alltalk/alltalk_tts/venv/lib64/python3.10/site-packages/torchaudio/lib/_torchaudio.so)
frame #45: + 0x8e897 (0x7f6a52aac897 in /lib64/libc.so.6)
frame #46: + 0x11580c (0x7f6a52b3380c in /lib64/libc.so.6)
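"Invalid data found when processing input" is the message FFmpeg gives when it cannot parse a file in the requested format, so one thing worth checking is whether the temp file is really a WAV. Below is a rough sketch that looks only at the first 12 bytes for the RIFF/WAVE signature. `inspect_wav_header` is a hypothetical helper, not part of AllTalk or torchaudio, and the path in the usage line is a placeholder. A passing check does not prove the file is decodable; a failing check would suggest the file was not written as WAV (or is empty or truncated).

```python
import os
import struct


def inspect_wav_header(path):
    """Check whether a file begins with a RIFF/WAVE header.

    Returns a dict with the file size in bytes, a boolean 'is_riff_wave',
    and a short human-readable 'reason'.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as handle:
        header = handle.read(12)

    if len(header) < 12:
        return {"size": size, "is_riff_wave": False,
                "reason": "file is shorter than a 12-byte RIFF header"}

    # Layout: 4-byte chunk id, 4-byte little-endian size, 4-byte format tag.
    chunk_id, _declared_size, format_tag = struct.unpack("<4sI4s", header)
    if chunk_id != b"RIFF":
        return {"size": size, "is_riff_wave": False,
                "reason": f"chunk id is {chunk_id!r}, expected b'RIFF'"}
    if format_tag != b"WAVE":
        return {"size": size, "is_riff_wave": False,
                "reason": f"format tag is {format_tag!r}, expected b'WAVE'"}
    return {"size": size, "is_riff_wave": True, "reason": "header looks like WAV"}


if __name__ == "__main__":
    import sys

    # Usage: python check_wav.py /path/to/custom_tempfile.wav  (placeholder path)
    if len(sys.argv) > 1:
        print(inspect_wav_header(sys.argv[1]))
```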
Desktop (please complete the following information):
AllTalk was updated: [approx. date]: Most recent
Custom Python environment: [yes/no give details if yes]: Yes. The one that was used in the setup
Text-generation-webUI was updated: [approx. date]: Most recent
Additional context Add any other context about the problem here.