facebookresearch / StyleNeRF

This is the open-source implementation of the ICLR 2022 paper "StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis".

Command for resuming from checkpoint? #17

abhay-sheshadri opened this issue 2 years ago

abhay-sheshadri commented 2 years ago

What is the command for resuming training from the ffhq256 checkpoint for fine-tuning?

I tried:

python run_train.py outdir=training_runs data="C:\Users\abhay\Documents\datasets\new_dataset" spec=paper256 model=default resume="C:\Users\abhay\Documents\development\StyleNeRF\pretrained\ffhq_256.pkl"

This resulted in the error:

Traceback (most recent call last):
  File "run_train.py", line 380, in main
    subprocess_fn(rank=0, args=args)
  File "run_train.py", line 302, in subprocess_fn
    training_loop.training_loop(**args)
  File "C:\Users\abhay\Documents\development\StyleNeRF\training\training_loop.py", line 171, in training_loop
    D = dnnlib.util.construct_class_by_name(**D_kwargs, **common_kwargs).train().requires_grad_(False).to(device) # subclass of torch.nn.Module
  File "C:\Users\abhay\Documents\development\StyleNeRF\dnnlib\util.py", line 292, in construct_class_by_name
    return call_func_by_name(*args, func_name=class_name, **kwargs)
  File "C:\Users\abhay\Documents\development\StyleNeRF\dnnlib\util.py", line 287, in call_func_by_name
    return func_obj(*args, **kwargs)
  File "C:\Users\abhay\Documents\development\StyleNeRF\torch_utils\persistence.py", line 104, in __init__
    super().__init__(*args, **kwargs)
  File "C:\Users\abhay\Documents\development\StyleNeRF\training\networks.py", line 1303, in __init__
    block = DiscriminatorBlock(in_channels, tmp_channels, out_channels, resolution=res,
  File "C:\Users\abhay\Documents\development\StyleNeRF\torch_utils\persistence.py", line 104, in __init__
    super().__init__(*args, **kwargs)
  File "C:\Users\abhay\Documents\development\StyleNeRF\training\networks.py", line 1130, in __init__
    self.fromrgb = Conv2dLayer(img_channels, tmp_channels, kernel_size=1, activation=activation,
  File "C:\Users\abhay\Documents\development\StyleNeRF\torch_utils\persistence.py", line 104, in __init__
    super().__init__(*args, **kwargs)
  File "C:\Users\abhay\Documents\development\StyleNeRF\training\networks.py", line 200, in __init__
    weight = torch.randn(weight_shape).to(memory_format=memory_format)
TypeError: randn(): argument 'size' (position 1) must be tuple of ints, not list

The command for training from scratch works.
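
The failing line builds weight_shape as a Python list, and the installed PyTorch build apparently rejects a list as the size argument. A minimal local workaround, assuming training/networks.py matches the traceback above (upgrading PyTorch may also avoid the error):

# training/networks.py, line 200 (per the traceback)
# Before:
#   weight = torch.randn(weight_shape).to(memory_format=memory_format)
# After: pass the size as a tuple of ints, which every PyTorch version accepts.
weight = torch.randn(tuple(weight_shape)).to(memory_format=memory_format)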

MultiPath commented 2 years ago

Thanks, I will debug on my side and get back to you by Monday.

zhanglonghao1992 commented 2 years ago

@MultiPath Cannot resume training from the ffhq512 checkpoint with "python run_train.py outdir=xxx data=xxx spec=paper512 model=stylenerf_ffhq resume=ffhq512.pkl":

Error loading: synthesis.fg_nerf.feat_out.weight torch.Size([64, 128, 1, 1]) torch.Size([256, 128, 1, 1])

It seems that rgb_out_dim should be 64, but it is either 256 or 32 in all ffhq model config files.

KyriaAnnwyn commented 2 years ago

> @MultiPath Cannot resume training from the ffhq512 checkpoint with "python run_train.py outdir=xxx data=xxx spec=paper512 model=stylenerf_ffhq resume=ffhq512.pkl":
>
> Error loading: synthesis.fg_nerf.feat_out.weight torch.Size([64, 128, 1, 1]) torch.Size([256, 128, 1, 1])
>
> It seems that rgb_out_dim should be 64, but it is either 256 or 32 in all ffhq model config files.

You can change rgb_out_dim to 64, but there are other differences in the network architecture; I didn't figure out how to modify the network to match the pretrained ffhq512 model.
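
One way to pin down the right config values is to load the checkpoint and print the stored parameter shapes, then match the YAML against them. A rough sketch, assuming the StyleGAN2-ADA-style legacy loader that ships with the repo and a local ffhq_512.pkl:

import dnnlib
import legacy  # StyleGAN2-ADA-style loader bundled with StyleNeRF

# Load the pretrained generator and list every parameter shape so config
# values (rgb_out_dim, channel_max, ...) can be matched against the YAML.
with dnnlib.util.open_url('ffhq_512.pkl') as f:
    G = legacy.load_network_pkl(f)['G_ema']

for name, param in G.named_parameters():
    print(name, tuple(param.shape))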

justinpinkney commented 2 years ago

By applying the recommendation from this issue: https://github.com/facebookresearch/StyleNeRF/issues/23

I got the following command to at least load the pre-trained ffhq_512.pkl weights:

python run_train.py outdir=train_out data=${PWD}/metfaces.zip spec=paper512 model=stylenerf_ffhq resume=${PWD}/ffhq_512.pkl model.G_kwargs.synthesis_kwargs.upsample_mode=pixelshuffle model.G_kwargs.synthesis_kwargs.rgb_out_dim=64

But there's still something wrong with the output of the pre-trained model, as the first snapshot looks like this:

[image: first training snapshot]

MontaEllis commented 2 years ago

I met the same error, have you solved it?

> By applying the recommendation from issue #23, I got the following command to at least load the pre-trained ffhq_512.pkl weights:
>
> python run_train.py outdir=train_out data=${PWD}/metfaces.zip spec=paper512 model=stylenerf_ffhq resume=${PWD}/ffhq_512.pkl model.G_kwargs.synthesis_kwargs.upsample_mode=pixelshuffle model.G_kwargs.synthesis_kwargs.rgb_out_dim=64
>
> But there's still something wrong with the output of the pre-trained model, as the first snapshot looks like this: [image]

justinpinkney commented 2 years ago

@MontaEllis I gave up in the end and didn't spend any more time trying to figure it out.

SuwoongHeo commented 1 year ago

@justinpinkney Several arguments in stylenerf_ffhq.yaml differ from the checkpoint's arguments. I modified the config file as follows and added an extra parameter on the command line: resolution=512. Note that I didn't change upsample_mode or rgb_out_dim.

# @package _group_
name: stylenerf_ffhq

G_kwargs:
    class_name: "training.networks.Generator"
    z_dim: 512
    w_dim: 512

    mapping_kwargs:
        num_layers: ${spec.map}

    synthesis_kwargs:
        # global settings
        num_fp16_res: ${num_fp16_res}
        channel_base: 1
        channel_max: 1024
        conv_clamp: 256
        kernel_size: 1
        architecture: skip
        upsample_mode: "nn_cat"

        z_dim_bg: 32
        z_dim: 0
        resolution_vol: 32
        resolution_start: 32
        rgb_out_dim: 256

        use_noise: False
        module_name: "training.stylenerf.NeRFSynthesisNetwork"
        no_bbox: True
        margin: 0
        magnitude_ema_beta: 0.999

        camera_kwargs:
            range_v: [1.4157963267948965, 1.7257963267948966]
            range_u: [-0.3, 0.3]
            range_radius: [1.0, 1.0]
            depth_range: [0.88, 1.12]
            fov: 12
            gaussian_camera: True
            angular_camera: True
            depth_transform:  ~
            dists_normalized: False
            ray_align_corner: True
            bg_start: 0.5

        renderer_kwargs:
            n_bg_samples: 4
            n_ray_samples: 14
            abs_sigma: False
            hierarchical: True
            no_background: False

        foreground_kwargs:
            positional_encoding: "normal"
            downscale_p_by: 1
            use_style: "StyleGAN2"
            predict_rgb: True
            use_viewdirs: False
            normalized_feat: True

        background_kwargs:
            positional_encoding: "normal"
            hidden_size: 64
            n_blocks: 4
            downscale_p_by: 1
            skips: []
            inverse_sphere: True
            use_style: "StyleGAN2"
            predict_rgb: True
            use_viewdirs: False
            normalized_feat: True

        upsampler_kwargs:
            channel_base: ${model.G_kwargs.synthesis_kwargs.channel_base}
            channel_max:  ${model.G_kwargs.synthesis_kwargs.channel_max}
            no_2d_renderer: False
            no_residual_img: True
            block_reses: ~
            shared_rgb_style: False
            upsample_type: "bilinear"

        progressive: True

        # regularization
        n_reg_samples: 16
        reg_full: True

D_kwargs:
    class_name: "training.stylenerf.Discriminator"
    epilogue_kwargs:
        mbstd_group_size: ${spec.mbstd}

    num_fp16_res: ${num_fp16_res}
    channel_base: ${spec.fmaps}
    channel_max: 512
    conv_clamp: 256
    architecture: skip
    progressive: ${model.G_kwargs.synthesis_kwargs.progressive}
    lowres_head: ${model.G_kwargs.synthesis_kwargs.resolution_start}
    upsample_type: "bilinear"
    resize_real_early: True

# loss kwargs
loss_kwargs:
    pl_batch_shrink: 2
    pl_decay: 0.01
    pl_weight: 2
    style_mixing_prob: 0.9
    curriculum: [500,5000]
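
For reference, the resume command then looks something like this (output and dataset paths are placeholders):

python run_train.py outdir=train_out data=${PWD}/dataset.zip spec=paper512 model=stylenerf_ffhq resume=${PWD}/ffhq_512.pkl resolution=512
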
Aria-Zhangjl commented 1 year ago

Hi @SuwoongHeo, I modified stylenerf_ffhq.yaml as you described and tried to resume training from the released ffhq_512.pkl. However, I still encounter the error:

Error loading: synthesis.fg_nerf.feat_out.weight torch.Size([64, 128, 1, 1]) torch.Size([256, 128, 1, 1])

and after I modified rgb_out_dim from 256 to 64, another error occurs:

Error loading: synthesis.b64.conv0.adapter.0.weight torch.Size([128, 512, 1, 1]) torch.Size([128, 1024, 1, 1])

Do you know how to fix it? Thanks!

> @justinpinkney Several arguments in stylenerf_ffhq.yaml differ from the checkpoint's arguments. I modified the config file as follows and added an extra parameter on the command line: resolution=512. Note that I didn't change upsample_mode or rgb_out_dim.
>
> (full config quoted from the comment above)
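
To surface every remaining mismatch at once, instead of hitting them one "Error loading" line at a time, the checkpoint can be diffed against a freshly constructed generator. A rough sketch, assuming the repo's legacy loader as above; G_new is a hypothetical generator built from the edited YAML:

import dnnlib
import legacy

# Load the pretrained generator from the released pickle.
with dnnlib.util.open_url('ffhq_512.pkl') as f:
    G_ckpt = legacy.load_network_pkl(f)['G_ema']

def diff_shapes(new_model, ckpt_model):
    # Print every parameter/buffer whose shape disagrees between the two
    # models, so all config mismatches show up in a single pass.
    ckpt = dict(ckpt_model.named_parameters())
    ckpt.update(dict(ckpt_model.named_buffers()))
    new = dict(new_model.named_parameters())
    new.update(dict(new_model.named_buffers()))
    for name, tensor in new.items():
        if name not in ckpt:
            print('missing in checkpoint:', name)
        elif tuple(ckpt[name].shape) != tuple(tensor.shape):
            print('shape mismatch:', name, tuple(tensor.shape), 'vs', tuple(ckpt[name].shape))

# diff_shapes(G_new, G_ckpt)  # G_new: generator built from the edited config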