mchong6 / JoJoGAN

Official PyTorch repo for JoJoGAN: One Shot Face Stylization
MIT License
1.42k stars 206 forks

The size of tensor a (4) must match the size of tensor b (3) at non-singleton dimension 0 #17

Closed: onzie9 closed this issue 2 years ago

onzie9 commented 2 years ago

I am confused by this error. I uploaded my own style images and used the supplied iu.jpeg file as the image to transform. I get this error in the last cell. I have verified that the style images are 3-channel images.

How many style images are needed? Does the format of the images matter?

mchong6 commented 2 years ago

Only one style image is needed. This looks like what happens when your style image is a PNG with 4 channels (RGBA). The align face code seems to have issues with that. Can you paste the entire error you got so I know which line is causing this? Alternatively, try another style image.

onzie9 commented 2 years ago

Here is the full error. Note that img.shape from OpenCV indicates the image has 3 channels.

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-22-1fb6e2251d67> in <module>()
     18 for name in names:
     19     style_path = f'style_images_aligned/{strip_path_extension(name)}.png'
---> 20     style_image = transform(Image.open(style_path))
     21     style_images.append(style_image)
     22 

3 frames
/usr/local/lib/python3.7/dist-packages/torchvision/transforms/functional.py in normalize(tensor, mean, std, inplace)
    349     if std.ndim == 1:
    350         std = std.view(-1, 1, 1)
--> 351     tensor.sub_(mean).div_(std)
    352     return tensor
    353 

RuntimeError: The size of tensor a (4) must match the size of tensor b (3) at non-singleton dimension 0

onzie9 commented 2 years ago

![Screen Shot 2022-01-10 at 5 00 21 PM](https://user-images.githubusercontent.com/10655590/148787707-921bdd2b-edbc-49a7-92d3-050402671a78.png)

mchong6 commented 2 years ago

Try doing this

style_image = transform(Image.open(style_path).convert("RGB"))

I'm actually not sure why this is happening. I assumed that if the align face function worked, the resulting PNG image would always have 3 channels. Maybe that assumption is wrong.
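
For readers hitting the same error, here is a minimal sketch of why the conversion helps, using only Pillow. The in-memory image and the `channels_match` helper are hypothetical stand-ins, not code from the JoJoGAN notebook: the image plays the role of an aligned style PNG that kept its alpha channel, and the helper mimics the shape check that fails when a 4-channel tensor is normalized with a 3-element mean. A possible explanation for the 3-channel `img.shape` reported earlier is that OpenCV's `cv2.imread` loads images as 3-channel BGR by default and discards alpha, while `Image.open` keeps the file's original mode.

```python
from PIL import Image

# Hypothetical stand-in for an aligned style image saved with an alpha channel.
rgba_image = Image.new("RGBA", (8, 8), (255, 0, 0, 128))


def channels_match(image, mean=(0.5, 0.5, 0.5)):
    """Return True if the image has exactly one band per normalization mean entry."""
    return len(image.getbands()) == len(mean)


# convert("RGB") drops the alpha band, leaving three channels.
rgb_image = rgba_image.convert("RGB")

print(rgba_image.mode, channels_match(rgba_image))  # RGBA False
print(rgb_image.mode, channels_match(rgb_image))    # RGB True
```

Checking `Image.open(style_path).mode` on the aligned file is a quick way to confirm whether this is the cause before changing the notebook.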

onzie9 commented 2 years ago

That did it! Thanks.