wyhsirius / LIA

[ICLR 22, TPAMI 24] Latent Image Animator
https://wyhsirius.github.io/LIA-project/

inference with output of 512x512 #6

Open skunkwerk opened 2 years ago

skunkwerk commented 2 years ago

Great work, thank you for publishing this. I was wondering how to get the high-resolution video outputs (512x512) that you include on the project page. Do I just need to set the size parameter to 512, instead of the default of 256 here?

skunkwerk commented 2 years ago

It looks like I'd need a different model checkpoint?

When I tried a size of 512, I got the following error:

```
Traceback (most recent call last):
  File "run_demo.py", line 111, in <module>
    demo = Demo(args)
  File "run_demo.py", line 65, in __init__
    self.gen.load_state_dict(weight)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1605, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Generator:
    Missing key(s) in state_dict: "enc.net_app.convs.7.conv1.0.weight", "enc.net_app.convs.7.conv1.1.bias", "enc.net_app.convs.7.conv2.0.kernel", "enc.net_app.convs.7.conv2.1.weight", "enc.net_app.convs.7.conv2.2.bias", "enc.net_app.convs.7.skip.0.kernel", "enc.net_app.convs.7.skip.1.weight", "enc.net_app.convs.8.weight", "dec.convs.12.conv.weight", "dec.convs.12.conv.blur.kernel", "dec.convs.12.conv.modulation.weight", "dec.convs.12.conv.modulation.bias", "dec.convs.12.noise.weight", "dec.convs.12.activate.bias", "dec.convs.13.conv.weight", "dec.convs.13.conv.modulation.weight", "dec.convs.13.conv.modulation.bias", "dec.convs.13.noise.weight", "dec.convs.13.activate.bias", "dec.to_rgbs.6.bias", "dec.to_rgbs.6.upsample.kernel", "dec.to_rgbs.6.conv.0.weight", "dec.to_rgbs.6.conv.1.bias", "dec.to_flows.6.bias", "dec.to_flows.6.upsample.kernel", "dec.to_flows.6.conv.weight", "dec.to_flows.6.conv.modulation.weight", "dec.to_flows.6.conv.modulation.bias".
    Unexpected key(s) in state_dict: "enc.net_app.convs.7.weight".
    size mismatch for enc.net_app.convs.0.0.weight: copying a param with shape torch.Size([64, 3, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 3, 1, 1]).
    size mismatch for enc.net_app.convs.0.1.bias: copying a param with shape torch.Size([1, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([1, 32, 1, 1]).
    size mismatch for enc.net_app.convs.1.conv1.0.weight: copying a param with shape torch.Size([64, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
    size mismatch for enc.net_app.convs.1.conv1.1.bias: copying a param with shape torch.Size([1, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([1, 32, 1, 1]).
    size mismatch for enc.net_app.convs.1.conv2.1.weight: copying a param with shape torch.Size([128, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 32, 3, 3]).
    size mismatch for enc.net_app.convs.1.conv2.2.bias: copying a param with shape torch.Size([1, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([1, 64, 1, 1]).
    size mismatch for enc.net_app.convs.1.skip.1.weight: copying a param with shape torch.Size([128, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 32, 1, 1]).
    size mismatch for enc.net_app.convs.2.conv1.0.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
    size mismatch for enc.net_app.convs.2.conv1.1.bias: copying a param with shape torch.Size([1, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([1, 64, 1, 1]).
    size mismatch for enc.net_app.convs.2.conv2.1.weight: copying a param with shape torch.Size([256, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 64, 3, 3]).
    size mismatch for enc.net_app.convs.2.conv2.2.bias: copying a param with shape torch.Size([1, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1, 128, 1, 1]).
    size mismatch for enc.net_app.convs.2.skip.1.weight: copying a param with shape torch.Size([256, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 64, 1, 1]).
    size mismatch for enc.net_app.convs.3.conv1.0.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
    size mismatch for enc.net_app.convs.3.conv1.1.bias: copying a param with shape torch.Size([1, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1, 128, 1, 1]).
    size mismatch for enc.net_app.convs.3.conv2.1.weight: copying a param with shape torch.Size([512, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]).
    size mismatch for enc.net_app.convs.3.conv2.2.bias: copying a param with shape torch.Size([1, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([1, 256, 1, 1]).
    size mismatch for enc.net_app.convs.3.skip.1.weight: copying a param with shape torch.Size([512, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 128, 1, 1]).
    size mismatch for enc.net_app.convs.4.conv1.0.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]).
    size mismatch for enc.net_app.convs.4.conv1.1.bias: copying a param with shape torch.Size([1, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([1, 256, 1, 1]).
    size mismatch for enc.net_app.convs.4.conv2.1.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 256, 3, 3]).
    size mismatch for enc.net_app.convs.4.skip.1.weight: copying a param with shape torch.Size([512, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 256, 1, 1]).
```
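The missing keys (an extra encoder block `enc.net_app.convs.8`, extra decoder convs, a seventh `to_rgbs`/`to_flows` stage) are consistent with a StyleGAN2-style layout whose depth grows with log2 of the output size. A back-of-the-envelope sketch, assuming the usual StyleGAN2 block count rather than LIA's exact code:

```python
import math

# Rough sketch (assumed StyleGAN2-style layout, not LIA's exact code):
# each doubling of the output size adds one encoder/decoder block and one
# to_rgb/to_flow stage, so a 256 checkpoint cannot fit a 512 generator.
for size in (256, 512):
    n_stages = int(math.log2(size)) - 2   # stages from 4x4 up to `size`
    print(f"{size}x{size}: {n_stages} upsampling stages")
```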

Zirconium2159 commented 2 years ago

Same problem. I tried training this model from scratch with 1024x1024 output, but after a week of training it can only render the upper-left quarter (a 256x256 area). Maybe there are configurations to modify other than the '--size' argument?
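For what it's worth, that symptom (content confined to one corner, the rest black) is what you get when a warping grid is normalized by a hard-coded base resolution instead of the actual one: coordinates past the base size fall outside `grid_sample`'s [-1, 1] range and are zero-padded. A purely illustrative repro, not LIA's actual code:

```python
import torch
import torch.nn.functional as F

# Illustrative repro of the "only the top-left 256x256 quadrant renders"
# symptom -- NOT LIA's actual code. The grid below is normalized by a
# hard-coded 256 instead of the real output size, so every coordinate
# past index 255 lands outside [-1, 1] and grid_sample zero-pads it.
size, base = 1024, 256
img = torch.rand(1, 3, size, size)                 # stand-in feature map

norm = torch.arange(size, dtype=torch.float32) / (base - 1) * 2 - 1
gy, gx = torch.meshgrid(norm, norm, indexing="ij")
grid = torch.stack((gx, gy), dim=-1).unsqueeze(0)  # (1, H, W, 2), (x, y) order

out = F.grid_sample(img, grid, padding_mode="zeros", align_corners=True)
print(out[0, 0, :base, :base].abs().mean())        # non-zero: content here
print(out[0, 0, base:, base:].abs().mean())        # ~0: the rest is black
```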

BbChip0103 commented 2 years ago

> Same problem. I tried training this model from scratch with 1024x1024 output, but after a week of training it can only render the upper-left quarter (a 256x256 area). Maybe there are configurations to modify other than the '--size' argument?

Hi @Zirconium2159, please check #8. Your issue should be solved now.

And to get 512-resolution results, I think we first need a checkpoint trained at 512. That hasn't been shared yet, so I'm waiting for it.
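In the meantime, you can at least verify which resolution a given checkpoint was trained at by counting the encoder blocks stored in it. A quick sketch (the checkpoint path and the `'gen'` key are assumptions based on the traceback above; adjust to your setup):

```python
import torch

# Quick check of a checkpoint's training resolution. The path and the
# 'gen' key are assumptions -- adjust to however you saved/downloaded it.
ckpt = torch.load("checkpoints/vox.pt", map_location="cpu")
weight = ckpt.get("gen", ckpt) if isinstance(ckpt, dict) else ckpt
blocks = {k.split(".")[3] for k in weight if k.startswith("enc.net_app.convs.")}
# A 256 model stops at convs.7 (the "unexpected key" in the traceback);
# a 512 model would also contain convs.8.
print(sorted(blocks, key=int))
```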

leg0m4n commented 2 years ago

Hi @BbChip0103, I don't think there is a checkpoint for that. The datasets they used for training are all 256x256, so either they trained on a 512x512 dataset they didn't mention, or there is some trick that makes the 256x256 model work on 512x512 images. I find the second more plausible: as far as I'm aware, no datasets at such a high resolution are available anywhere.
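If the trick is nothing deeper than post-hoc upscaling, the obvious stopgap (not a real fix) would be generating at the native 256x256 and upsampling the frames:

```python
import torch
import torch.nn.functional as F

# Obvious stopgap, not a real fix: generate at the native 256x256 and
# bicubic-upsample the frames to 512x512. `frames` is a stand-in tensor
# for whatever the 256 model actually outputs.
frames = torch.rand(32, 3, 256, 256)
frames_512 = F.interpolate(frames, size=(512, 512),
                           mode="bicubic", align_corners=False)
print(frames_512.shape)  # torch.Size([32, 3, 512, 512])
```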

Zirconium2159 commented 2 years ago

> Hi @BbChip0103, I don't think there is a checkpoint for that. The datasets they used for training are all 256x256, so either they trained on a 512x512 dataset they didn't mention, or there is some trick that makes the 256x256 model work on 512x512 images. I find the second more plausible: as far as I'm aware, no datasets at such a high resolution are available anywhere.

I'm trying to train LIA on the CelebV-HQ dataset.

leg0m4n commented 2 years ago

Oh, right, totally forgot about it, it was released very recently. Let us know how it goes, super curious!

leg0m4n commented 2 years ago

Hi @Zirconium2159! Did you have any luck with training on CelebV-HQ?

jet3004 commented 1 year ago

> Hi @Zirconium2159! Did you have any luck with training on CelebV-HQ?

@Zirconium2159 Yeahhhh, any luck? Seems like a great dataset.

huangxin168 commented 10 months ago

Any update on training on CelebV-HQ?