zibojia / COCOCO

Video-Inpaint-Anything: This is the inference code for our paper CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility.
https://zibojia.github.io

runtime error #4

mychina75 closed this issue 2 months ago

mychina75 commented 2 months ago

Hello, I tried to run the code with `python app.py`. After SAM2's tracking finished and the "Inpainting" step started, the following error was reported. Would you please check? Thank you!

```
#######################################################
  0%|          | 0/50 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/data/lib/python3.10/site-packages/gradio/routes.py", line 439, in run_predict
    output = await app.get_blocks().process_api(
  File "/data/lib/python3.10/site-packages/gradio/blocks.py", line 1384, in process_api
    result = await self.call_function(
  File "/data/lib/python3.10/site-packages/gradio/blocks.py", line 1089, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/data/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/data/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
    return await future
  File "/data/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 859, in run
    result = context.run(func, *args)
  File "/data/lib/python3.10/site-packages/gradio/utils.py", line 700, in wrapper
    response = f(*args, **kwargs)
  File "/data/codes/video/COCOCO/app.py", line 306, in inpaint_video
    images = generate_frames(\
  File "/data/codes/video/COCOCO/utils.py", line 133, in generate_frames
    videos, masked_videos, recon_videos = validation_pipeline(
  File "/data/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/codes/video/COCOCO/cococo/pipelines/pipeline_animation_inpainting_cross_attention_vae.py", line 437, in __call__
    noise_pred_uncond = self.unet(latent_model_input[0:1], masked_image_model_input[0:1], \
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/codes/video/COCOCO/cococo/models/unet.py", line 421, in forward
    sample, res_samples = downsample_block(
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/codes/video/COCOCO/cococo/models/unet_blocks.py", line 413, in forward
    hidden_states = motion_module(hidden_states,
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/codes/video/COCOCO/cococo/models/motion_module.py", line 86, in forward
    hidden_states = self.temporal_transformer(hidden_states, encoder_hidden_states, vision_encoder_hidden_states, attention_mask)
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/codes/video/COCOCO/cococo/models/motion_module.py", line 161, in forward
    hidden_states = block(hidden_states, encoder_hidden_states=encoder_hidden_states, \
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/codes/video/COCOCO/cococo/models/motion_module.py", line 268, in forward
    hidden_states = attention_block(
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/codes/video/COCOCO/cococo/models/motion_module.py", line 623, in forward
    encoder_hidden_states = self.pos_encoder(encoder_hidden_states)
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/codes/video/COCOCO/cococo/models/motion_module.py", line 300, in forward
    x = x + self.pe[:, :x.size(1)]
RuntimeError: The size of tensor a (11088) must match the size of tensor b (4800) at non-singleton dimension 1
```

zibojia commented 2 months ago

I will fix this bug.

zibojia commented 2 months ago

Update done.

mychina75 commented 2 months ago

Thank you for updating, but it looks like the issue still exists.

```
###################################################################
(44, 360, 640, 3) (44, 360, 640, 1)

  0%|          | 0/50 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/data/lib/python3.10/site-packages/gradio/routes.py", line 439, in run_predict
    output = await app.get_blocks().process_api(
  File "/data/lib/python3.10/site-packages/gradio/blocks.py", line 1384, in process_api
    result = await self.call_function(
  File "/data/lib/python3.10/site-packages/gradio/blocks.py", line 1089, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/data/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/data/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
    return await future
  File "/data/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 859, in run
    result = context.run(func, *args)
  File "/data/lib/python3.10/site-packages/gradio/utils.py", line 700, in wrapper
    response = f(*args, **kwargs)
  File "/data/codes/video/COCOCO-main/app.py", line 309, in inpaint_video
    images = generate_frames(\
  File "/data/codes/video/COCOCO-main/utils.py", line 131, in generate_frames
    videos, masked_videos, recon_videos = validation_pipeline(
  File "/data/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/codes/video/COCOCO-main/cococo/pipelines/pipeline_animation_inpainting_cross_attention_vae.py", line 437, in __call__
    noise_pred_uncond = self.unet(latent_model_input[0:1], masked_image_model_input[0:1], \
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/codes/video/COCOCO-main/cococo/models/unet.py", line 417, in forward
    sample, res_samples = downsample_block(
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/codes/video/COCOCO-main/cococo/models/unet_blocks.py", line 413, in forward
    hidden_states = motion_module(hidden_states,
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/codes/video/COCOCO-main/cococo/models/motion_module.py", line 86, in forward
    hidden_states = self.temporal_transformer(hidden_states, encoder_hidden_states, vision_encoder_hidden_states, attention_mask)
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/codes/video/COCOCO-main/cococo/models/motion_module.py", line 161, in forward
    hidden_states = block(hidden_states, encoder_hidden_states=encoder_hidden_states, \
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/codes/video/COCOCO-main/cococo/models/motion_module.py", line 268, in forward
    hidden_states = attention_block(
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/codes/video/COCOCO-main/cococo/models/motion_module.py", line 623, in forward
    encoder_hidden_states = self.pos_encoder(encoder_hidden_states)
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/codes/video/COCOCO-main/cococo/models/motion_module.py", line 300, in forward
    x = x + self.pe[:, :x.size(1)]
RuntimeError: The size of tensor a (11088) must match the size of tensor b (10000) at non-singleton dimension 1
```

zibojia commented 2 months ago

You can use a lower resolution; I think 400x400 is good for our model. Moreover, I will update my inference code in a few days to support any length.
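Until that update lands, a minimal sketch for downscaling a clip with OpenCV before feeding it to `app.py`; the file names are placeholders and `opencv-python` is assumed to be installed:

```python
import cv2

# Resize every frame of input.mp4 to 400x400 and write input_400.mp4
# (both file names are placeholders), keeping the original frame rate.
cap = cv2.VideoCapture("input.mp4")
fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
out = cv2.VideoWriter("input_400.mp4",
                      cv2.VideoWriter_fourcc(*"mp4v"), fps, (400, 400))
while True:
    ok, frame = cap.read()
    if not ok:
        break
    out.write(cv2.resize(frame, (400, 400)))
cap.release()
out.release()
```

Note that a straight resize to 400x400 changes the aspect ratio of a 360x640 clip; crop or pad first if that matters for your content.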

mychina75 commented 2 months ago

Thank you. It works for me with a smaller video size.