williamyang1991 / VToonify

[SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer

out = torch.cat([f_G, abs(f_G-f_E)], dim=1) RuntimeError: The size of tensor a (126) must match the size of tensor b (125) at non-singleton dimension 3 #31

Closed yaohwang closed 1 year ago

yaohwang commented 1 year ago

Traceback (most recent call last):
  File "/VToonify/style_transfer.py", line 226, in <module>
    y_tilde = vtoonify(inputs, s_w.repeat(inputs.size(0), 1, 1), d_s = args.style_degree)
  File "/root/miniconda3/envs/python-app/lib/python3.9/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/VToonify/model/vtoonify.py", line 258, in forward
    out, m_E = self.fusion_out[fusion_index](out, f_E, d_s)
  File "/root/miniconda3/envs/python-app/lib/python3.9/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/VToonify/model/vtoonify.py", line 125, in forward
    out = torch.cat([f_G, abs(f_G-f_E)], dim=1)
RuntimeError: The size of tensor a (126) must match the size of tensor b (125) at non-singleton dimension 3
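For context, a 125-vs-126 mismatch at dimension 3 (width) is what you would expect when the two branches that produce f_G and f_E round the spatial size differently for an input width that is not a multiple of the total stride. A minimal arithmetic illustration, assuming a hypothetical width of 2008 (not taken from this run):

    import math

    w = 2008                   # example width that is not divisible by 16
    print(w // 16)             # 125 -> a path that floors (e.g. strided convolutions)
    print(math.ceil(w / 16))   # 126 -> a path that rounds up (e.g. an upsample/resize)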

williamyang1991 commented 1 year ago

Make sure the width and the height of your input image are divisible by 8. If the problem persists, make sure they are divisible by 16.
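If you need to enforce this manually before running the script, a minimal sketch (assuming an image loaded with OpenCV as an H x W x C array; pad_to_multiple is a hypothetical helper, not part of the repo):

    import cv2

    def pad_to_multiple(img, m=8):
        # Reflect-pad the bottom/right edges so both dimensions become
        # divisible by m (try m=16 if 8 is not enough).
        h, w = img.shape[:2]
        pad_h = (m - h % m) % m
        pad_w = (m - w % m) % m
        return cv2.copyMakeBorder(img, 0, pad_h, 0, pad_w, cv2.BORDER_REFLECT)

    img = cv2.imread('./data/038648.jpg')
    img = pad_to_multiple(img, 8)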

yaohwang commented 1 year ago

That makes sense, but I'm using the example code and image, so I think it should work.

python style_transfer.py --content ./data/038648.jpg \
       --scale_image --backbone toonify \
       --ckpt ./checkpoint/vtoonify_t_arcane/vtoonify.pt \
       --padding 600 600 600 600  # use large padding to avoid cropping the image

I fixed it with a small change, maybe not the best way. Anyway, I can open a pull request if that's OK.
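The thread doesn't show the actual change. Purely as an illustration, one way to make the concatenation in vtoonify.py tolerant of a one-pixel mismatch would be to crop both feature maps to their common spatial size first (a hypothetical sketch, not the fix submitted here):

    import torch

    def fuse(f_G: torch.Tensor, f_E: torch.Tensor) -> torch.Tensor:
        # Hypothetical workaround: crop both feature maps to their shared
        # spatial size before concatenation, so a one-pixel mismatch
        # (e.g. 126 vs 125) no longer raises a RuntimeError.
        h = min(f_G.size(2), f_E.size(2))
        w = min(f_G.size(3), f_E.size(3))
        f_G, f_E = f_G[:, :, :h, :w], f_E[:, :, :h, :w]
        return torch.cat([f_G, (f_G - f_E).abs()], dim=1)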

williamyang1991 commented 1 year ago

This is confusing, because I have the following code to make sure the image size is divisible by 8, and the example command has no issue on my side.

https://github.com/williamyang1991/VToonify/blob/cf993aac7943b74ade4b84645edc771171be6d32/util.py#L184-L187

yaohwang commented 1 year ago

Yeah, I saw that, so I've been confused too.

The following example command and image run fine:

python style_transfer.py --content ./data/038648.jpg \
       --scale_image --style_id 77 --style_degree 0.5 \
       --ckpt ./checkpoint/vtoonify_d_arcane/vtoonify_s_d.pt \
       --padding 600 600 600 600  # use large padding to avoid cropping the image

but the first command above just doesn't work as expected.

I'll dig deeper later and find out why.

yaohwang commented 1 year ago

My mistake: what I was actually running was

python style_transfer.py

with the default parameters and the default image ./data/077436.jpg. Still, I think it would be better to handle this kind of input as well; "image size must be divisible by 8" is quite a hard limit.