StelaBou / stylegan_directions_face_reenactment

Authors' official PyTorch implementation of "Finding Directions in GAN's Latent Space for Neural Face Reenactment" [BMVC 2022].

Missing key(s) in state_dict in the run_facial_editing.py file #4

Closed: dedkamaroz closed this issue 1 year ago

dedkamaroz commented 1 year ago

COMMAND:

```bash
python3 run_facial_editing.py --source_path ./images/selfie4.jpg --output_path ./output/facial_editing --directions 0 1 2 3 4 --image_resolution 1024 --dataset_type ffhq --save_images --optimize_generator
```

OUTPUT:

```
creating the FLAME Decoder
trained model found. Load /home/ubuntu/stylegan_directions_face_reenactment/libs/DECA/data/deca_model.tar
----- Load generator from ./pretrained_models/stylegan2-ffhq-config-f_1024.pt -----
----- Load A matrix from ./pretrained_models/A_matrix.pt -----
Linear Direction matrix-A in w+ space: input dimension 15, output dimension 512, shift dimension 512
----- Load e4e encoder from ./pretrained_models/e4e-voxceleb.pt -----
Traceback (most recent call last):
  File "run_facial_editing.py", line 315, in <module>
    inference.run_editing()
  File "run_facial_editing.py", line 217, in run_editing
    self.load_models(inversion)
  File "run_facial_editing.py", line 98, in load_models
    self.encoder.load_state_dict(ckpt['e'])
  File "/home/ubuntu/miniconda3/envs/python38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Encoder4Editing:
    Missing key(s) in state_dict: "styles.14.convs.0.weight", "styles.14.convs.0.bias", "styles.14.convs.2.weight", "styles.14.convs.2.bias", "styles.14.convs.4.weight", "styles.14.convs.4.bias", "styles.14.convs.6.weight", "styles.14.convs.6.bias", "styles.14.convs.8.weight", "styles.14.convs.8.bias", "styles.14.convs.10.weight", "styles.14.convs.10.bias", "styles.14.linear.weight", "styles.14.linear.bias", "styles.15.convs.0.weight", "styles.15.convs.0.bias", "styles.15.convs.2.weight", "styles.15.convs.2.bias", "styles.15.convs.4.weight", "styles.15.convs.4.bias", "styles.15.convs.6.weight", "styles.15.convs.6.bias", "styles.15.convs.8.weight", "styles.15.convs.8.bias", "styles.15.convs.10.weight", "styles.15.convs.10.bias", "styles.15.linear.weight", "styles.15.linear.bias", "styles.16.convs.0.weight", "styles.16.convs.0.bias", "styles.16.convs.2.weight", "styles.16.convs.2.bias", "styles.16.convs.4.weight", "styles.16.convs.4.bias", "styles.16.convs.6.weight", "styles.16.convs.6.bias", "styles.16.convs.8.weight", "styles.16.convs.8.bias", "styles.16.convs.10.weight", "styles.16.convs.10.bias", "styles.16.linear.weight", "styles.16.linear.bias", "styles.17.convs.0.weight", "styles.17.convs.0.bias", "styles.17.convs.2.weight", "styles.17.convs.2.bias", "styles.17.convs.4.weight", "styles.17.convs.4.bias", "styles.17.convs.6.weight", "styles.17.convs.6.bias", "styles.17.convs.8.weight", "styles.17.convs.8.bias", "styles.17.convs.10.weight", "styles.17.convs.10.bias", "styles.17.linear.weight", "styles.17.linear.bias".
```

ENVIRONMENT:

- Lambda Labs cloud GPU server
- Ubuntu 20.04.1
- NVIDIA A10, 24 GB VRAM
- Python 3.8.0

Thank you

StelaBou commented 1 year ago

This happens because you are running with the FFHQ settings (`--image_resolution 1024 --dataset_type ffhq`). The provided models (the StyleGAN2 generator, the e4e encoder, and the A_matrix) were trained on the VoxCeleb dataset at an image resolution of 256.
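The specific missing keys line up with how pSp-style encoders such as e4e size their style-mapping head from the output resolution: StyleGAN2 at resolution R has 2·log2(R) − 2 layers in w+ space, and the encoder builds one `styles.i` block per layer. A checkpoint trained at 256 therefore holds `styles.0` through `styles.13`, while a model instantiated for 1024 expects `styles.0` through `styles.17`, so `styles.14` through `styles.17` have no weights to load. A minimal sketch of that relationship (the helper name is illustrative, not from this repository):

```python
import math

def n_style_blocks(resolution: int) -> int:
    """Number of styles.* blocks in a pSp/e4e encoder for a given
    StyleGAN2 output resolution: one per w+ layer, 2*log2(R) - 2 total."""
    return 2 * int(math.log2(resolution)) - 2

print(n_style_blocks(256))   # 14 -> checkpoint contains styles.0 .. styles.13
print(n_style_blocks(1024))  # 18 -> model expects styles.0 .. styles.17
```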
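Assuming the goal is to run with the released VoxCeleb checkpoints rather than retrain for FFHQ, the fix would presumably be the original command with the resolution and dataset flags switched to match the pretrained models (the exact string accepted by `--dataset_type` is an assumption; check the script's argument choices):

```bash
python3 run_facial_editing.py \
    --source_path ./images/selfie4.jpg \
    --output_path ./output/facial_editing \
    --directions 0 1 2 3 4 \
    --image_resolution 256 \
    --dataset_type voxceleb \
    --save_images \
    --optimize_generator
```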