royorel / StyleSDF


Could you provide the pre-trained discriminator model? #4

Closed NIRVANALAN closed 2 years ago

NIRVANALAN commented 2 years ago

Hi, nice work and congratulations. I wonder if you could provide your pre-trained discriminator model corresponding to the generator checkpoint, which could facilitate fine-tuning downstream tasks on other datasets?

Thanks and looking forward to your response.

royorel commented 2 years ago

Thanks @NIRVANALAN!

The discriminator models are already included in the pre-trained models. Each .pt file stores a dictionary with three entries:

  1. g_ema - Contains the weights of the exponential moving average generator used for inference.
  2. g - Contains the weights of the raw generator used for training.
  3. d - Contains the weights of the discriminator.

So to load the discriminator you would do something like this:

# imports added for completeness; the exact import path for Discriminator may differ in the repo
import torch
from model import Discriminator

# initialize the discriminator architecture
discriminator = Discriminator(opt.model).to(device)

# load the checkpoint dictionary
checkpoint = torch.load(checkpoint_path)

# copy the checkpoint weights into the model
discriminator.load_state_dict(checkpoint["d"])
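
Loading the EMA generator for inference follows the same pattern. A minimal sketch, assuming Generator is constructed analogously to the Discriminator above (check the repo's inference scripts for the exact constructor arguments):

from model import Generator  # import path assumed, as above

# build the generator and load the exponential moving average weights
generator = Generator(opt.model).to(device)  # constructor arguments assumed
generator.load_state_dict(checkpoint["g_ema"])
generator.eval()  # the EMA weights are the ones meant for inference
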
NIRVANALAN commented 2 years ago

@royorel Hi, thanks for your quick reply. I guess the pre-trained weights are for the StyleGAN discriminator, right, not the volumerender_discriminator?

royorel commented 2 years ago

That depends on which model you load. The pre-trained volume renderer checkpoints contain the volume renderer discriminator, while the full pipeline checkpoints contain the StyleGAN discriminator.
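
If you're unsure which discriminator a given checkpoint holds, inspecting the parameter names and shapes stored under "d" identifies the architecture. A minimal sketch (the file name is a placeholder):

import torch

# the two discriminators have different layer names, so the first few
# entries of the state dict reveal which architecture was saved
checkpoint = torch.load("checkpoint.pt", map_location="cpu")
for name, weight in list(checkpoint["d"].items())[:5]:
    print(name, tuple(weight.shape))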

NIRVANALAN commented 2 years ago

Great, thanks for your clarification. Also looking forward to your training code release.

Have a nice day!

NIRVANALAN commented 2 years ago

Hi, another quick question: I noticed the renderer weights in full_model and vol_renderer are not identical, though very close. According to the paper, the renderer should be frozen during the second stage, right? I wonder if you fine-tuned the renderer and the style mapping network in the second stage.

Attached is a screenshot of the weights.

royorel commented 2 years ago

There's no fine-tuning. The volume renderer weights of the full model are taken from the volume renderer checkpoint's g_ema entry. You should compare the full model's weights against vol_renderer['g_ema'].
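
A minimal sketch of that comparison, assuming the renderer weights appear under the same key names in both checkpoints (file names are placeholders; adjust the keys if the full model nests the renderer under a prefix):

import torch

full = torch.load("full_model.pt", map_location="cpu")["g_ema"]
vol = torch.load("vol_renderer.pt", map_location="cpu")["g_ema"]

# compare every key the two state dicts share
shared = [k for k in vol if k in full]
mismatched = [k for k in shared if not torch.equal(full[k], vol[k])]
print(f"{len(shared)} shared keys, {len(mismatched)} mismatched")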

NIRVANALAN commented 2 years ago

I see, they match now. Thanks for your clarification~

FeiiYin commented 2 years ago

Hi, thanks for your impressive work. I want to use StyleSDF to generate images at a specific pose given a condition image. I noticed the discriminator in VolumeRenderer can predict viewpoint values, but when I directly use the predicted viewpoints to generate images, it always gives the top view. Can you provide some suggestions? Here are the shortened code and results.

score, viewpoint = discriminator(img_condition)
...
sample_cam_extrinsics, sample_focals, sample_near, sample_far, sample_locations = \
    generate_camera_params(opt.renderer_output_size, device, batch=num_viewdirs,
                           locations=viewpoint,  # input_fov=fov,
                           uniform=opt.camera.uniform, azim_range=opt.camera.azim,
                           elev_range=opt.camera.elev, fov_ang=fov,
                           dist_radius=opt.camera.dist_radius)
...
outputs = generator(z,
                    sample_cam_extrinsics,
                    sample_focals,
                    sample_near,
                    sample_far,
                    truncation=opt.truncation_ratio,
                    truncation_latent=mean_latent)

Condition image and generated image: screenshots attached.

FeiiYin commented 2 years ago

I found a bug in my code. The problem was solved when I fixed it. :)