Drexubery / ViewCrafter

Official implementation of "ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis"
Apache License 2.0
881 stars, 32 forks

Different aspect ratio #18

Closed: sudonymously closed 1 month ago

sudonymously commented 1 month ago

Hi, when trying a different aspect ratio like 320x512 or 1024x576 (for portrait mode), I run into the error below. Are there any tricks to run this model using a different aspect ratio?

RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 40 but got size 72 for tensor number 1 in the list.

Full error

  File "/src/utils/diffusion_utils.py", line 179, in image_guided_synthesis
    samples, _ = ddim_sampler.sample(S=ddim_steps,
  File "/root/.pyenv/versions/3.10.15/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/src/lvdm/models/samplers/ddim.py", line 115, in sample
    samples, intermediates = self.ddim_sampling(conditioning, size,
  File "/root/.pyenv/versions/3.10.15/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/src/lvdm/models/samplers/ddim.py", line 187, in ddim_sampling
    outs = self.p_sample_ddim(img, cond, ts, index=index, use_original_steps=ddim_use_original_steps,
  File "/root/.pyenv/versions/3.10.15/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/src/lvdm/models/samplers/ddim.py", line 223, in p_sample_ddim
    e_t_cond = self.model.apply_model(x, t, c, **kwargs)
  File "/src/lvdm/models/ddpm3d.py", line 733, in apply_model
    x_recon = self.model(x_noisy, t, **cond, **kwargs)
  File "/root/.pyenv/versions/3.10.15/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/src/lvdm/models/ddpm3d.py", line 1441, in forward
    xc = torch.cat([x] + c_concat, dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 40 but got size 72 for tensor number 1 in the list.

Drexubery commented 1 month ago

Hi, this error is quite strange. The script automatically center-crops the input image and resizes it to 576x1024, so theoretically, this shouldn't cause an error. Are you using the CLI or the Gradio app? Could you paste your test image here?
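
For readers hitting the same error, here is a rough sketch of one plausible reading of the numbers in the message. It assumes the latent space is 8x smaller than the image in each spatial dimension (a common choice for latent diffusion VAEs, not confirmed in this thread), in which case 320 / 8 = 40 and 576 / 8 = 72, i.e. the noise latent would follow the requested 320x512 size while the conditioning image was still resized to 576x1024. The helper names below are hypothetical and are not part of ViewCrafter; the crop logic only illustrates a generic center crop, not necessarily what the script does.

```python
# Illustrative sketch only: these helpers are hypothetical, not ViewCrafter code.
LATENT_FACTOR = 8  # ASSUMPTION: VAE spatial downsampling factor


def center_crop_box(src_w, src_h, target_w=1024, target_h=576):
    """Return (left, top, right, bottom) of the largest centered crop
    that has the target aspect ratio. Uses integer math throughout."""
    if src_w * target_h > src_h * target_w:
        # Source is wider than the target ratio: keep full height, trim sides.
        crop_h = src_h
        crop_w = src_h * target_w // target_h
    else:
        # Source is taller (or equal): keep full width, trim top and bottom.
        crop_w = src_w
        crop_h = src_w * target_h // target_w
    left = (src_w - crop_w) // 2
    top = (src_h - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


def latent_hw(height, width, factor=LATENT_FACTOR):
    """Spatial size of the latent for an image of the given size."""
    return height // factor, width // factor


def check_concat_compatible(noise_hw, cond_hw):
    """Mimic the constraint behind the error: tensors concatenated along
    the channel dimension must agree in every other dimension."""
    if noise_hw != cond_hw:
        raise ValueError(
            f"spatial mismatch: noise latent {noise_hw} vs conditioning {cond_hw}"
        )


if __name__ == "__main__":
    noise = latent_hw(320, 512)    # size requested for sampling
    cond = latent_hw(576, 1024)    # size the input image is resized to
    print(noise, cond)
    try:
        check_concat_compatible(noise, cond)
    except ValueError as err:
        print(err)
```

Under these assumptions, the mismatch disappears only when the sampling size and the image preprocessing size are changed together, which may be why overriding just one of them fails.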

sudonymously commented 1 month ago

Hi, I was able to resolve this by following the steps here: https://github.com/Drexubery/ViewCrafter/issues/23#issuecomment-2371206533