AIFSH / ComfyUI-XTTS

A custom ComfyUI node for coqui-ai/TTS's XTTS module. Supports voice cloning and TTS in 17 languages.
Mozilla Public License 2.0

TypeError: Invalid file: {'waveform': tensor #8

Open dancemanUK opened 1 week ago

dancemanUK commented 1 week ago

Loading model...
Computing speaker latents...
!!! Exception during processing!!! Invalid file: {'waveform': tensor([[[ 0.0042, 0.0042, 0.0044, ..., -0.1867, -0.1892, -0.1906], [-0.0138, -0.0141, -0.0141, ..., -0.2047, -0.2070, -0.2084]]]), 'sample_rate': 48000}
Traceback (most recent call last):
  File "F:\ComfyUI_windows_portable_cu121\ComfyUI\execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "F:\ComfyUI_windows_portable_cu121\ComfyUI\execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "F:\ComfyUI_windows_portable_cu121\ComfyUI\execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(*slice_dict(input_data_all, i)))
  File "F:\ComfyUI_windows_portable_cu121\ComfyUI\custom_nodes\ComfyUI-XTTS\nodes.py", line 252, in get_wav_tts
    gpt_cond_latent, speaker_embedding = model.get_conditioning_latents(audio_path=[audio])
  File "F:\ComfyUI_windows_portable_cu121\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "F:\ComfyUI_windows_portable_cu121\ComfyUI\custom_nodes\ComfyUI-XTTS\TTS\tts\models\xtts.py", line 357, in get_conditioning_latents
    audio = load_audio(file_path, load_sr)
  File "F:\ComfyUI_windows_portable_cu121\ComfyUI\custom_nodes\ComfyUI-XTTS\TTS\tts\models\xtts.py", line 73, in load_audio
    audio, lsr = torchaudio.load(audiopath)
  File "F:\ComfyUI_windows_portable_cu121\python_embeded\lib\site-packages\torchaudio\_backend\utils.py", line 205, in load
    return backend.load(uri, frame_offset, num_frames, normalize, channels_first, format, buffer_size)
  File "F:\ComfyUI_windows_portable_cu121\python_embeded\lib\site-packages\torchaudio\_backend\soundfile.py", line 27, in load
    return soundfile_backend.load(uri, frame_offset, num_frames, normalize, channels_first, format)
  File "F:\ComfyUI_windows_portable_cu121\python_embeded\lib\site-packages\torchaudio\_backend\soundfile_backend.py", line 221, in load
    with soundfile.SoundFile(filepath, "r") as file:
  File "F:\ComfyUI_windows_portable_cu121\python_embeded\lib\site-packages\soundfile.py", line 658, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "F:\ComfyUI_windows_portable_cu121\python_embeded\lib\site-packages\soundfile.py", line 1212, in _open
    raise TypeError("Invalid file: {0!r}".format(self.name))
TypeError: Invalid file: {'waveform': tensor([[[ 0.0042, 0.0042, 0.0044, ..., -0.1867, -0.1892, -0.1906], [-0.0138, -0.0141, -0.0141, ..., -0.2047, -0.2070, -0.2084]]]), 'sample_rate': 48000}

dancemanUK commented 1 week ago

The QR code for the author's WeChat group has expired, so I can't join to ask for help.

iEddie-cmd commented 4 days ago

Same Issue!

zmwv823 commented 1 day ago

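You can't use ComfyUI's built-in audio loader here; you have to use the loader node that ships with this plugin. ComfyUI's loader outputs the audio file already converted to tensor data (a tensor), whereas the plugin's loader outputs the audio file's path as a text value (a string). That mismatch is what triggers the error, unless the node adds a check: if the input is a tensor, save a temporary wav file first; if the input is a string, read the file directly.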
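
The check described above could look roughly like the following. This is only a sketch, not the plugin's actual code: `resolve_audio_path` is a hypothetical helper name, and the `[batch, channels, samples]` layout is assumed from the tensor printed in the traceback. It uses only the standard library so that it runs on its own, and it writes 16-bit PCM.

```python
import os
import struct
import tempfile
import wave


def resolve_audio_path(audio):
    """Return a file path for `audio`, writing a temporary WAV if needed.

    Hypothetical helper, not part of ComfyUI-XTTS.
    """
    # Case 1: already a path string (what the plugin's own loader outputs).
    if isinstance(audio, (str, os.PathLike)):
        return os.fspath(audio)

    # Case 2: a ComfyUI-style dict {'waveform': tensor, 'sample_rate': int}.
    if isinstance(audio, dict) and "waveform" in audio and "sample_rate" in audio:
        waveform = audio["waveform"]
        # torch tensors expose .tolist(); plain nested lists are accepted as-is.
        data = waveform.tolist() if hasattr(waveform, "tolist") else waveform
        # Assumed layout: [batch, channels, samples]; use the first batch item.
        channels = data[0]

        # Create the file and close the handle first so it can be reopened on Windows.
        fd, path = tempfile.mkstemp(suffix=".wav")
        os.close(fd)

        with wave.open(path, "wb") as wav_file:
            wav_file.setnchannels(len(channels))
            wav_file.setsampwidth(2)  # 16-bit PCM
            wav_file.setframerate(int(audio["sample_rate"]))
            frames = bytearray()
            # Interleave the channels frame by frame, clipping to [-1.0, 1.0].
            for frame in zip(*channels):
                for sample in frame:
                    clipped = max(-1.0, min(1.0, float(sample)))
                    frames += struct.pack("<h", int(clipped * 32767))
            wav_file.writeframes(bytes(frames))
        return path

    raise TypeError(f"Unsupported audio input: {type(audio).__name__}")
```

In `get_wav_tts`, the call from the traceback would then presumably become `model.get_conditioning_latents(audio_path=[resolve_audio_path(audio)])`. The pure-Python sample loop is slow for long clips; since torchaudio is already installed in this environment, `torchaudio.save(path, waveform[0], sample_rate)` would likely be the more practical way to write the temporary file. The caller would also need to delete the temporary file when it is done with it.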