BadToBest / EchoMimic

Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
https://badtobest.github.io/echomimic.html
Apache License 2.0
2.37k stars 277 forks

Execution error: The size of tensor a (64) must match the size of tensor b (128) at non-singleton dimension 4 #142

Open yeohx opened 3 weeks ago

yeohx commented 3 weeks ago

The size of tensor a (64) must match the size of tensor b (128) at non-singleton dimension 4

The image I uploaded is a PNG of about 500 KB, and the audio is a FLAC file of about 1.8 MB.

Steps:
Start the service: python3 -u webgui.py --server_port=3000
Upload the image and audio through the web page.

Image: img-jIpz5hAkPxsl0TKWGuzffkWz

Log output:

To create a public link, set share=True in launch().
video in 24 FPS, audio idx in 50FPS
whisper_chunks: (266, 50, 384)
audio_fea_final: torch.Size([1, 266, 50, 384])
ref_image_latents shape: torch.Size([1, 4, 64, 64])
face_mask_tensor shape: torch.Size([1, 1, 1, 1024, 1024])
face_locator_tensor shape: torch.Size([2, 320, 1, 128, 128])
0%| | 0/30 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/gradio/queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/gradio/route_utils.py", line 288, in call_process_api
    output = await app.get_blocks().process_api(
  File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/gradio/blocks.py", line 1931, in process_api
    result = await self.call_function(
  File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/gradio/blocks.py", line 1516, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
    return await future
  File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 859, in run
    result = context.run(func, *args)
  File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/gradio/utils.py", line 826, in wrapper
    response = f(*args, **kwargs)
  File "/root/EchoMimic/webgui.py", line 233, in generate_video
    final_output_path = process_video(
  File "/root/EchoMimic/webgui.py", line 175, in process_video
    video = pipe(
  File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/EchoMimic/src/pipelines/pipeline_echo_mimic.py", line 507, in __call__
    pred = self.denoising_unet(
  File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/EchoMimic/src/models/unet_3d_echo.py", line 494, in forward
    sample = sample + face_musk_fea
RuntimeError: The size of tensor a (64) must match the size of tensor b (128) at non-singleton dimension 4

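One possible reading of the log above, not confirmed anywhere in this thread: the reference latents are 64x64 while the face locator features are 128x128, so the addition at unet_3d_echo.py line 494 cannot broadcast. The sketch below reproduces that shape check in plain Python, with no PyTorch needed. The helper `broadcast_shape` is hypothetical and written only for illustration. The leading dimensions of `sample_shape` are an assumption; only its trailing 64x64 comes from the log. The factor of 8 is inferred from the 1024x1024 mask and the 128x128 locator features.

```python
from typing import Sequence, Tuple


def broadcast_shape(a: Sequence[int], b: Sequence[int]) -> Tuple[int, ...]:
    """Return the broadcast shape of two shapes, or raise ValueError.

    Dimensions are compared from the trailing one backwards, so the first
    mismatch reported is the right-most incompatible dimension.
    """
    ndim = max(len(a), len(b))
    out = [1] * ndim
    for i in range(ndim - 1, -1, -1):
        offset = ndim - 1 - i  # distance from the trailing dimension
        dim_a = a[len(a) - 1 - offset] if offset < len(a) else 1
        dim_b = b[len(b) - 1 - offset] if offset < len(b) else 1
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            raise ValueError(
                f"The size of tensor a ({dim_a}) must match the size of "
                f"tensor b ({dim_b}) at non-singleton dimension {i}"
            )
        out[i] = dim_b if dim_a == 1 else dim_a
    return tuple(out)


SCALE = 8                      # inferred: 1024 / 128 in the log
mask_hw = (1024, 1024)         # face_mask_tensor spatial size (from the log)
latent_hw = (64, 64)           # ref_image_latents spatial size (from the log)

# The locator features follow the mask size, not the latent size.
locator_hw = tuple(s // SCALE for s in mask_hw)      # (128, 128)

# Assumption: `sample` shares the leading dims of the locator features.
sample_shape = (2, 320, 1, *latent_hw)
locator_shape = (2, 320, 1, *locator_hw)

try:
    broadcast_shape(sample_shape, locator_shape)
    message = ""
except ValueError as err:
    message = str(err)

print(message)
```

If this reading holds, the two sizes would agree once the face mask has the same pixel size as the resized reference image (64 x 8 = 512 under the same factor), which is worth checking before looking deeper into the UNet.
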
yiyijiu92 commented 3 weeks ago

same problem +1

XeoOuYang commented 3 weeks ago

Try this approach: A bug was found here, some jpeg format found no face here #137

mythzhang8 commented 2 weeks ago

I have this problem too.

ZW1396 commented 2 weeks ago

me too

agchaowanhui commented 3 days ago

det_bboxes, probs = face_detector.detect(cv2.cvtColor(face_img, cv2.COLOR_BGR2RGB))
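The line above converts the image from BGR (the channel order `cv2.imread` returns) to RGB before running face detection. As a minimal sketch of how a missed detection could be surfaced early instead of failing later inside the UNet, the wrapper below checks the detector's result. The names `detect_or_raise`, `fake_hit`, and `fake_miss` are hypothetical. It is also an assumption that the detector returns `None` or an empty list of boxes when no face is found. The sketch takes the detector as a plain callable so it runs without OpenCV or the model.

```python
from typing import Callable, Optional, Sequence, Tuple

Box = Sequence[float]
DetectResult = Tuple[Optional[Sequence[Box]], Optional[Sequence[float]]]


def detect_or_raise(
    detect: Callable[[object], DetectResult], rgb_image: object
) -> Tuple[Sequence[Box], Sequence[float]]:
    """Run `detect` on an RGB image and fail loudly when no face is found."""
    boxes, probs = detect(rgb_image)
    if boxes is None or len(boxes) == 0:
        # Assumption: a missing face is reported as None or an empty list.
        raise ValueError(
            "no face detected; check the image format and channel order"
        )
    return boxes, probs


# Stand-in detectors for illustration only; they are not the real model.
def fake_hit(image: object) -> DetectResult:
    return [[10.0, 20.0, 110.0, 140.0]], [0.99]


def fake_miss(image: object) -> DetectResult:
    return None, None


boxes, probs = detect_or_raise(fake_hit, rgb_image=None)

try:
    detect_or_raise(fake_miss, rgb_image=None)
    miss_error = ""
except ValueError as err:
    miss_error = str(err)
```

In the web UI, the real call would pass `face_detector.detect` and the converted image in place of the stand-ins.
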