facebookresearch / multiface

Hosts the Multiface dataset, which is a multi-view dataset of multiple identities performing a sequence of facial expressions.

Error in loading the pretrained model #24

Open zhanglonghao1992 opened 1 year ago

zhanglonghao1992 commented 1 year ago

When I run the testing script by:

```
python -m torch.distributed.launch --nproc_per_node=1 test.py --data_dir /path/to/mini_dataset/m--20180227--0000--6795937--GHS --krt_dir /path/to/mini_dataset/m--20180227--0000--6795937--GHS/KRT --framelist_test /path/to/mini_dataset/m--20180227--0000--6795937--GHS/frame_list.txt --test_segment "./mini_test_segment.json"
```

I got the error:

```
RuntimeError: Error(s) in loading state_dict for DeepAppearanceVAE:
    size mismatch for cc.weight: copying a param with shape torch.Size([75, 3, 1, 1]) from checkpoint, the shape in current model is torch.Size([37, 3, 1, 1]).
    size mismatch for cc.bias: copying a param with shape torch.Size([75, 3, 1, 1]) from checkpoint, the shape in current model is torch.Size([37, 3, 1, 1]).
```

It seems like you used more cameras for training than my dataset provides.
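A quick way to spot this kind of mismatch before `load_state_dict` raises is to diff the parameter shapes of the checkpoint against a freshly built model. The helper below is a minimal sketch (not part of the multiface codebase); it works on plain shape tuples, which with PyTorch you would obtain via `{k: tuple(v.shape) for k, v in state_dict.items()}`:

```python
# Sketch: report parameters whose shapes differ between a checkpoint and the
# current model, e.g. cc.weight (75, 3, 1, 1) vs (37, 3, 1, 1) as in the
# error above. `shape_mismatches` is a hypothetical helper, not a repo API.

def shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (ckpt_shape, model_shape)} for params whose shapes differ."""
    return {
        name: (ckpt, model_shapes[name])
        for name, ckpt in checkpoint_shapes.items()
        if name in model_shapes and model_shapes[name] != ckpt
    }

ckpt = {"cc.weight": (75, 3, 1, 1), "cc.bias": (75, 3, 1, 1)}
model = {"cc.weight": (37, 3, 1, 1), "cc.bias": (37, 3, 1, 1)}
print(shape_mismatches(ckpt, model))
```

Any parameter that shows up here would trigger the `size mismatch` error when loading with `strict=True` (the default).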

vexilligera commented 1 year ago

Hi,

Sorry for the late reply. I tried

```
python -m torch.distributed.launch --nproc_per_node=1 test.py --data_dir dataset/m--20180227--0000--6795937--GHS --krt_dir dataset/m--20180227--0000--6795937--GHS/KRT --framelist_test dataset/m--20180227--0000--6795937--GHS/frame_list.txt --test_segment ./mini_test_segment.json --model_path pretrained_model/6795937_model.pth
```

and the model was correctly loaded. The link to the model can be found at pretrained_model/index.html. Could you provide the model name you loaded as well?

dafei-qin commented 1 year ago

I encountered the same issue when loading the mini pretrained model you provided in INSTALLATION.md.

vexilligera commented 1 year ago

> I encountered the same issue when loading the mini pretrained model you provided in INSTALLATION.md.

Hi, could you try the full model for now using the command previously mentioned? It's the same size as the mini model and it appears to work fine. @cwuu Could you check if the cameras used for training the mini model are correct?

Thanks

dafei-qin commented 1 year ago

Yes, I can load the full model. However, the output textures look weird, like these pred_tex.

The result.txt is:

```
Best screen loss 0.000000, best tex loss 0.070132, best vert loss 0.002344, screen loss 0.000000, tex loss 0.070132, vert_loss 0.002344
```
vexilligera commented 1 year ago

> Yes I can load the full model. However the output textures look weird, like these pred_tex.
>
> The result.txt is:
>
> ```
> Best screen loss 0.000000, best tex loss 0.070132, best vert loss 0.002344, screen loss 0.000000, tex loss 0.070132, vert_loss 0.002344
> ```

Hi, this is because the model is conditioned on viewpoint. If part of the face is occluded during training, there is no supervision on that texture region, which leads to these bright artifacts. This is expected.
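To make the "no supervision on occluded regions" point concrete, here is a toy sketch (my own illustration, not the repo's loss code) of a visibility-masked texture loss. Texels that are never visible contribute zero gradient, so the decoder's output there is unconstrained, which is why occluded areas can drift to arbitrary bright values:

```python
# Sketch: a visibility-masked L1 texture loss. Plain Python lists stand in
# for texture tensors; `visible` is a per-texel visibility mask (1 = seen
# from the training viewpoint, 0 = occluded). Occluded texels are excluded,
# so their prediction error never reaches the optimizer.

def masked_l1(pred, target, visible):
    """Mean absolute error over visible texels only."""
    terms = [abs(p - t) for p, t, v in zip(pred, target, visible) if v]
    return sum(terms) / max(len(terms), 1)

# The second texel is wildly wrong (5.0 vs 0.0) but occluded, so the loss
# is still zero and nothing pushes the model to fix it.
print(masked_l1([1.0, 5.0], [1.0, 0.0], [1, 0]))
```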

revanj commented 1 year ago

Hi there! I was running the pretrained models and hit something similar. Specifically, these model dimension mismatches happen for identities 002643814, 7889059, 5372021, 2183941, and 002914589. I hand-wrote all the camera configs to include every camera that exists in their respective folders, but for these identities the pretrained models seem to expect a large number of extra cameras that the dataset doesn't provide (e.g. the dataset has 40 cameras but the checkpoint requires 76, etc.). Identities 6795937 and 8870559 ran correctly. Was I accidentally using the wrong network architecture? Anything I should check?

Thanks
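When a checkpoint and a camera config disagree like this, the camera count the checkpoint was trained with can be read back from the shapes in its state dict. This assumes, as the error messages in this thread suggest, that the first dimension of `cc.weight` equals the number of training cameras; the helper name is my own, not a repo API:

```python
# Sketch (assumed layout): infer the training camera count from a checkpoint's
# parameter shapes. With PyTorch you would build `shapes` via
# {k: tuple(v.shape) for k, v in torch.load(path, map_location="cpu").items()}.

def camera_count_from_checkpoint(shapes):
    """First dimension of cc.weight, assumed to be the number of cameras."""
    return shapes["cc.weight"][0]

# A checkpoint whose cc.weight is (75, 3, 1, 1) expects a 75-camera KRT config.
print(camera_count_from_checkpoint({"cc.weight": (75, 3, 1, 1)}))
```

Comparing this number against the number of entries in the identity's KRT file would tell you whether a given pretrained model matches the camera config before running `test.py`.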

cwuu commented 1 year ago

Hi @revanj ,

We just updated the codebase to include a different camera config for each identity, along with their pretrained models for the different architectures (with/without screen loss). Please let me know if it still doesn't work for your case. Thanks.

avinabsaha commented 1 year ago

Hi @cwuu, the mini model checkpoint still has the size mismatch issue.