ForeignGods / ComfyUI-Mana-Nodes

Font Animation, Automatic Speech Recognition and Text to Speech Custom Nodes for ComfyUI
MIT License
202 stars 12 forks

Cannot connect the speech recognition node to any audio loaders (and cannot figure out how to load audio-only mp3/mp4) #30

Open GamingDaveUk opened 2 months ago

GamingDaveUk commented 2 months ago

Hi, over on Reddit I asked whether anyone knows of a node that would let me load an mp3 containing a spoken story/bio and have it create a cool video with the text, or with an avatar talking: https://www.reddit.com/r/comfyui/comments/1dy83u6/comment/lc7u3zc/?context=3

I was linked to your GitHub and the demos looked promising. I got the nodes installed, but I cannot find a guide on how to use them. I tried the workflows; the second seems close to my goal: mp3 --> extract the words --> video with the words matched up to the audio.

However, none of the ComfyUI nodes that load audio could be linked to the audio file input of the speech recognition node, so I converted the mp3 to an mp4 using an online converter. Sadly, the Split Video node failed (likely because there is no video data, though the error message didn't say anything useful):

'''
Error occurred when executing Split Video:

Error in file C:\AI\ComfyUI_windows_portable\ComfyUI\input\video\dave.mp4, Accessing time t=296.04-296.09 seconds, with clip duration=296 seconds,

  File "C:\AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Mana-Nodes\nodes\video2audio_node.py", line 40, in run
    audio, fps = self.extract_audio_with_moviepy(video_path, kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Mana-Nodes\nodes\video2audio_node.py", line 71, in extract_audio_with_moviepy
    audio.write_audiofile(full_path)
  File "", line 2, in write_audiofile
  File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\decorators.py", line 54, in requires_duration
    return f(clip, *a, **k)
           ^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\audio\AudioClip.py", line 206, in write_audiofile
    return ffmpeg_audiowrite(self, filename, fps, nbytes, buffersize,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "", line 2, in ffmpeg_audiowrite
  File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\decorators.py", line 54, in requires_duration
    return f(clip, *a, **k)
           ^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\audio\io\ffmpeg_audiowriter.py", line 166, in ffmpeg_audiowrite
    for chunk in clip.iter_chunks(chunksize=buffersize,
  File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\audio\AudioClip.py", line 85, in iter_chunks
    yield self.to_soundarray(tt, nbytes=nbytes, quantize=quantize,
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "", line 2, in to_soundarray
  File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\decorators.py", line 54, in requires_duration
    return f(clip, *a, **k)
           ^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\audio\AudioClip.py", line 127, in to_soundarray
    snd_array = self.get_frame(tt)
                ^^^^^^^^^^^^^^^^^^
  File "", line 2, in get_frame
  File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\decorators.py", line 89, in wrapper
    return f(*new_a, **new_kw)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\Clip.py", line 93, in get_frame
    return self.make_frame(t)
           ^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\Clip.py", line 136, in <lambda>
    newclip = self.set_make_frame(lambda t: fun(self.get_frame, t))
                                            ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\Clip.py", line 187, in <lambda>
    return self.fl(lambda gf, t: gf(t_func(t)), apply_to,
                                 ^^^^^^^^^^^^^
  File "", line 2, in get_frame
  File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\decorators.py", line 89, in wrapper
    return f(*new_a, **new_kw)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\Clip.py", line 93, in get_frame
    return self.make_frame(t)
           ^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\audio\io\AudioFileClip.py", line 77, in <lambda>
    self.make_frame = lambda t: self.reader.get_frame(t)
                                ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\audio\io\readers.py", line 170, in get_frame
    raise IOError("Error in file %s, "%(self.filename)+
'''
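For reference, the message says the reader was asked for samples at t=296.04-296.09 seconds while the clip is reported as 296 seconds long, i.e. slightly past the end of the audio. Below is a minimal sketch of one possible workaround: re-export the audio outside ComfyUI with the last fraction of a second trimmed off. It assumes moviepy 1.x (the `moviepy.editor` import path and `subclip`). The helper names `safe_end_time` and `trim_audio`, the 0.1 second margin, and the output file name are hypothetical. It is not a confirmed fix for the Split Video node.

```python
def safe_end_time(duration: float, margin: float = 0.1) -> float:
    """Return an end time slightly before the reported duration, never below zero."""
    return max(0.0, duration - margin)


def trim_audio(src_path: str, dst_path: str, margin: float = 0.1) -> None:
    """Re-export the audio, stopping `margin` seconds before the reported end.

    Assumption: trimming the tail avoids reads past the end of the stream.
    """
    # Imported lazily so safe_end_time() can be used without moviepy installed.
    from moviepy.editor import AudioFileClip  # moviepy 1.x import path

    clip = AudioFileClip(src_path)
    try:
        trimmed = clip.subclip(0, safe_end_time(clip.duration, margin))
        trimmed.write_audiofile(dst_path)
    finally:
        clip.close()
```

Usage would look like `trim_audio("dave.mp4", "dave_trimmed.mp3")`, with both paths adjusted to the local setup.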

So how do we take a speech-synthesized audio file and turn it into a video with words, like in your demos?