NVlabs / few-shot-vid2vid

Pytorch implementation for few-shot photorealistic video-to-video translation.

[Pose] A problem during testing #68

Open cszy98 opened 3 years ago

cszy98 commented 3 years ago

The "Few-shot Video-to-Video Synthesis" paper states:

Moreover, we can also (optionally) finetune the network using the given example images to improve performance. Note that we only finetune the weight generation module E and the intermediate image synthesis network H, and leave all parameters related to flow estimation fixed.

But in the code:

train_names = ['fc', 'conv_img', 'up'] 
params, _ = self.get_train_params(self.netG, train_names)

and

    def get_train_params(self, netG, train_names):

        train_list = set()
        params = []          
        params_dict = netG.state_dict()

        for key, value in params_dict.items():
            do_train = False
            for model_name in train_names:
                if model_name in key: do_train = True            
            if do_train:
                module = netG                        
                key_list = key.split('.')
                for k in key_list:
                    module = getattr(module, k)
                params += [module]
                train_list.add('.'.join(key_list[:1]))
        Visualizer.vis_print(self.opt, ('training layers: ', train_list))
        return params, train_list

During testing, I get the output:

('training layers: ', {'fc_spade_0_0', 'ref_img_up_2', 'fc_spade_e_0', 'fc_spade_1_3', 'conv_img', 'fc_spade_1_0', 'ref_img_up_4', 'label_embedding', 'fc_spade_s_0', 'fc_spade_1_1', 'ref_label_up_1', 'fc_spade_0_3', 'fc_spade_e_3', 'fc_spade_s_3', 'fc_spade_e_2', 'up_4', 'up_2', 'ref_label_up_2', 'up_1', 'ref_label_up_3', 'ref_img_up_0', 'up_0', 'fc_spade_1_2', 'fc_spade_s_1', 'fc_spade_s_2', 'ref_label_up_0', 'ref_label_up_4', 'flow_network_temp', 'fc_spade_0_2', 'flow_network_ref', 'ref_img_up_1', 'up_5', 'fc_spade_0_1', 'up_3', 'ref_img_up_3', 'fc_spade_e_1'})

So "flow_network_ref" and "flow_network_temp" are finetuned as well, which contradicts the paper. The result collapses completely.
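A likely explanation is that `get_train_params` tests each entry of `train_names` as a substring of the full parameter key, so any key inside a flow network that happens to contain 'fc', 'conv_img', or 'up' pulls the whole top-level module into the list. Below is a minimal sketch of that effect. The key names are hypothetical and chosen only for illustration; the real keys in `netG.state_dict()` may differ.

```python
def select_top_level(keys, train_names):
    """Mimic the substring test from get_train_params, on key names only."""
    selected = set()
    for key in keys:
        # Same rule as the quoted code: a name matches anywhere in the key.
        if any(name in key for name in train_names):
            # Same bookkeeping as train_list: record the top-level module.
            selected.add(key.split('.')[0])
    return selected


# Hypothetical parameter keys (assumed for illustration, not read from the repo).
example_keys = [
    'fc_spade_0_0.weight',
    'conv_img.1.weight',
    'up_0.conv_0.weight',
    'flow_network_ref.up_flow.0.weight',
    'flow_network_temp.up_flow.0.weight',
    'flow_network_temp.down_flow.0.weight',
]

train_names = ['fc', 'conv_img', 'up']
selected = select_top_level(example_keys, train_names)
print(sorted(selected))
```

With these assumed keys, both flow networks end up in the selected set because 'up' occurs in the nested part of the key, not in the top-level name.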

sunyasheng commented 3 years ago

Have you solved this problem? I ran into the same issue.

sunyasheng commented 3 years ago

I filtered out the optical flow networks and lowered my learning rate. After that, the results look reasonable but are still unsatisfactory.
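One way such a filter could look is sketched below. It works on parameter names only and assumes the flow modules can be identified by the prefix 'flow_network', as the printed training layers suggest. The helper `filter_trainable` and the key names are hypothetical, not part of the repository.

```python
def filter_trainable(keys, train_names, exclude_prefixes=('flow_network',)):
    """Keep keys that match a trainable name but do not start with an excluded prefix."""
    kept = []
    for key in keys:
        if key.startswith(exclude_prefixes):
            continue  # leave flow estimation parameters fixed
        if any(name in key for name in train_names):
            kept.append(key)
    return kept


# Hypothetical parameter keys (assumed for illustration, not read from the repo).
keys = [
    'fc_spade_0_0.weight',
    'up_0.conv_0.weight',
    'flow_network_ref.up_flow.0.weight',
    'flow_network_temp.up_flow.0.weight',
]

kept = filter_trainable(keys, ['fc', 'conv_img', 'up'])
print(kept)
```

The kept names would then be resolved to parameters and passed to the optimizer with a reduced learning rate. The exact learning rate is not given in this thread.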

cszy98 commented 3 years ago

I used the same method as you, and the results were not satisfactory either, especially in the facial area. In addition, did you set the "--add_face_D" option? After I train with this option for a while, the synthesized facial area collapses.

sunyasheng commented 3 years ago

Sorry for the late reply. I didn't set the "--add_face_D" option.