Hi @sriderya! Indeed, in its default configuration the e4e encoder aims to maximize perceptual quality and editability rather than minimize distortion. While I am not sure what editing capabilities your StyleGAN has, the encoder does seem to output a perceptually plausible image. To obtain a more precise reconstruction, you have two options:
1. Train an encoder that prioritizes distortion. The output image may then have lower perceptual quality, but it's up to you to test it out. For that you can try:
a. --encoder_type=GradualStyleEncoder, and do not specify the following flags: --use_w_pool --w_discriminator_lambda --progressive_start
b. A hybrid approach: e4e without limiting the deltas but still using the discriminator, i.e. --use_w_pool --w_discriminator_lambda 0.1, without the --progressive_start flag.
2. Alternatively, since you have a pretrained encoder that yields results within the W space, you could use the obtained latent code as the initialization point for an optimization process. A short optimization with a low learning rate should yield a near-perfect reconstruction. This is something we experimented with and might add in a revision; until then, I might also upload the optimization code to the repo for further use.
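The optimization-based refinement described above can be sketched roughly as follows. This is a minimal illustration, not the authors' optimization code: `refine_latent` is a hypothetical helper, the tiny network in the demo only stands in for a frozen StyleGAN generator, and the loss is plain pixel-wise MSE (a perceptual term such as LPIPS could be added in practice).

```python
import torch
import torch.nn.functional as F


def refine_latent(generator, target, w_init, steps=200, lr=0.01):
    """Refine an encoder-predicted latent by gradient descent on pixel-wise MSE.

    `generator` is any callable mapping a latent tensor to an image tensor.
    Only the latent is updated; the generator's weights are never stepped.
    """
    # Start from the encoder's output instead of a random or mean latent.
    w = w_init.detach().clone().requires_grad_(True)
    optimizer = torch.optim.Adam([w], lr=lr)
    history = []  # loss measured before each update step
    for _ in range(steps):
        optimizer.zero_grad()
        loss = F.mse_loss(generator(w), target)
        loss.backward()
        optimizer.step()
        history.append(loss.item())
    return w.detach(), history


if __name__ == "__main__":
    torch.manual_seed(0)
    # Toy stand-in for a pretrained generator (assumption: a real run would
    # load StyleGAN weights here and keep them frozen in the same way).
    toy_generator = torch.nn.Sequential(
        torch.nn.Linear(16, 64), torch.nn.Tanh(), torch.nn.Linear(64, 48)
    )
    for p in toy_generator.parameters():
        p.requires_grad_(False)

    w_true = torch.randn(1, 16)
    target = toy_generator(w_true)               # plays the role of the real image
    w_encoder = w_true + 0.5 * torch.randn(1, 16)  # imperfect "encoder" estimate

    w_refined, history = refine_latent(toy_generator, target, w_encoder)
    print(f"loss before: {history[0]:.6f}  loss after: {history[-1]:.6f}")
```

Starting from the encoder's latent keeps the optimization short, and a small learning rate keeps the refined code close to the region the encoder already found.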
Best, Omer
Closing this issue for now, feel free to open it again in case of need!
Hello,
First of all, thanks for your great work. I am trying to train the encoder on a chest X-ray dataset. Although the results seem good, some details are missing, especially from a medical standpoint. As can be seen from the example below, some important details such as cables are not recovered, which is absolutely undesirable. By the way, the results may seem pretty good to you, but medical experts totally disagree :)
The parameters are:
{
  "batch_size": 8,
  "board_interval": 50,
  "checkpoint_path": null,
  "d_reg_every": 16,
  "dataset_type": "xray_encode",
  "delta_norm": 2,
  "delta_norm_lambda": 0.0002,
  "encoder_type": "Encoder4Editing",
  "exp_dir": "/path/to/experiment/dir",
  "id_lambda": 0.5,
  "image_interval": 100,
  "keep_optimizer": false,
  "l2_lambda": 1.0,
  "learning_rate": 0.0001,
  "lpips_lambda": 0.8,
  "lpips_type": "alex",
  "max_steps": 200000,
  "optim_name": "ranger",
  "progressive_start": 20000,
  "progressive_step_every": 2000,
  "progressive_steps": [0, 20000, 22000, 24000, 26000, 28000, 30000, 32000, 34000, 36000, 38000, 40000, 42000, 44000],
  "r1": 10,
  "resume_training_from_ckpt": null,
  "save_interval": null,
  "save_training_data": false,
  "start_from_latent_avg": true,
  "stylegan_size": 256,
  "stylegan_weights": "/path/to/stylegan2.pt",
  "sub_exp_dir": null,
  "test_batch_size": 4,
  "test_workers": 4,
  "train_decoder": false,
  "update_param_list": null,
  "use_w_pool": true,
  "val_interval": 10000,
  "w_discriminator_lambda": 0.1,
  "w_discriminator_lr": 2e-05,
  "w_pool_size": 50,
  "workers": 8
}
In order to get better inversion for this kind of dataset, which parameters should I tune? How can I improve my results?
Thanks in advance
Can you share the training command you entered in the terminal?
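The exact command is not given in the thread. As a rough sketch, assuming the training script accepts each posted parameter as a `--flag` of the same name (as the flags quoted in the reply suggest), the posted configuration and the two suggested variants would map to flags as shown here. The helper names and the `scripts/train.py` entry point are assumptions for illustration, and only a subset of the posted parameters is included.

```python
# Subset of the posted parameters relevant to the two suggested variants.
posted = {
    "dataset_type": "xray_encode",
    "exp_dir": "/path/to/experiment/dir",
    "stylegan_weights": "/path/to/stylegan2.pt",
    "stylegan_size": 256,
    "start_from_latent_avg": True,
    "encoder_type": "Encoder4Editing",
    "use_w_pool": True,
    "w_discriminator_lambda": 0.1,
    "progressive_start": 20000,
}


def to_flags(config):
    """Turn a parameter dict into a list of command-line flags."""
    flags = []
    for key, value in config.items():
        if value is None or value is False:
            continue  # unset options and disabled switches are omitted
        if value is True:
            flags.append(f"--{key}")  # boolean switch, no value
        else:
            flags.append(f"--{key}={value}")
    return flags


def distortion_variant(config):
    """Option (a): GradualStyleEncoder without the e4e-specific flags."""
    dropped = {"use_w_pool", "w_discriminator_lambda", "progressive_start"}
    out = {k: v for k, v in config.items() if k not in dropped}
    out["encoder_type"] = "GradualStyleEncoder"
    return out


def hybrid_variant(config):
    """Option (b): keep the latent discriminator, drop progressive training."""
    out = {k: v for k, v in config.items() if k != "progressive_start"}
    out["use_w_pool"] = True
    out["w_discriminator_lambda"] = 0.1
    return out


if __name__ == "__main__":
    # "scripts/train.py" is an assumed entry point; adjust to the actual script.
    for name, cfg in [
        ("posted", posted),
        ("option a", distortion_variant(posted)),
        ("option b", hybrid_variant(posted)),
    ]:
        print(f"# {name}")
        print("python scripts/train.py " + " ".join(to_flags(cfg)))
```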