NVIDIA / vid2vid

Pytorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic video-to-video translation.

Questions about fg #132

Open ygfrancois opened 4 years ago

ygfrancois commented 4 years ago

Hello, I have some questions about how to add the foreground-background prior.

  1. fg is used together with the instance input, right?
  2. The instance map provides the edges of the instances, and fg_labels decides which of them are foreground, but I don't understand the code of compute_mask:
    def compute_mask(self, real_As, ts, te=None): # compute the mask for foreground objects
        _, _, _, h, w = real_As.size() 
        if te is None:
            te = ts + 1        
        mask_F = real_As[:, ts:te, self.opt.fg_labels[0]].clone()
        for i in range(1, len(self.opt.fg_labels)):
            mask_F = mask_F + real_As[:, ts:te, self.opt.fg_labels[i]]
        mask_F = torch.clamp(mask_F, 0, 1)
        return mask_F  

    Here, "real_As" seems to hold all the foreground masks concatenated along the channel dimension, but "real_As" is built in "encode_input":

        if self.opt.use_instance:  #
            inst_map = inst_map.data.cuda()            
            edge_map = Variable(self.get_edges(inst_map))            
            input_map = torch.cat([input_map, edge_map], dim=2)
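
One possible reading of compute_mask, as a minimal sketch: it assumes (the excerpts above do not show this) that input_map already has one 0/1 channel per semantic class before the edge channel is concatenated, so each entry of fg_labels is a class channel index. The helper names `one_hot` and `compute_mask_sketch` are hypothetical, and plain Python lists stand in for a single frame of the tensor so the example runs without PyTorch.

```python
def one_hot(label_map, num_classes):
    """Turn an H x W map of class ids into num_classes channels of 0/1 values."""
    return [[[1 if v == c else 0 for v in row] for row in label_map]
            for c in range(num_classes)]


def compute_mask_sketch(frame_channels, fg_labels):
    """Sum the channels listed in fg_labels and clamp the result to [0, 1].

    frame_channels: list of C channels, each an H x W list of numbers.
    This mirrors the add-then-clamp logic of compute_mask for one frame.
    """
    h = len(frame_channels[0])
    w = len(frame_channels[0][0])
    mask = [[0] * w for _ in range(h)]
    for label in fg_labels:
        channel = frame_channels[label]
        for y in range(h):
            for x in range(w):
                mask[y][x] += channel[y][x]
    # Same role as torch.clamp(mask_F, 0, 1) in the original code.
    return [[min(max(v, 0), 1) for v in row] for row in mask]


# Toy semantic label map with three classes (0, 1, 2); values are illustrative.
label_map = [
    [0, 0, 2, 2],
    [0, 1, 1, 2],
]
channels = one_hot(label_map, num_classes=3)

# Placeholder edge channel appended last, as a channel-wise concatenation would do.
channels.append([[0, 0, 0, 0], [0, 0, 0, 0]])

# Treat classes 1 and 2 as foreground (assumed values for illustration).
mask = compute_mask_sketch(channels, fg_labels=[1, 2])
```

Under this assumption the edge channel sits after the class channels and is never selected by fg_labels; the mask comes from the semantic classes alone.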

I think the number of foreground channels is the same as that of the instance image that is read, because:

    def get_edges(self, t):
        edge = torch.cuda.ByteTensor(t.size()).zero_()
        edge[:,:,:,:,1:] = edge[:,:,:,:,1:] | (t[:,:,:,:,1:] != t[:,:,:,:,:-1])
        edge[:,:,:,:,:-1] = edge[:,:,:,:,:-1] | (t[:,:,:,:,1:] != t[:,:,:,:,:-1])
        edge[:,:,:,1:,:] = edge[:,:,:,1:,:] | (t[:,:,:,1:,:] != t[:,:,:,:-1,:])
        edge[:,:,:,:-1,:] = edge[:,:,:,:-1,:] | (t[:,:,:,1:,:] != t[:,:,:,:-1,:])
        return edge.float() 
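
To see what get_edges returns, here is a minimal sketch of the same neighbour comparison on one 2D instance map, written with plain Python lists. The function name `get_edges_2d` and the instance ids are made up for illustration; the original operates on a 5D CUDA tensor.

```python
def get_edges_2d(inst):
    """Mark every pixel whose right or lower neighbour has a different id.

    Both pixels of a differing pair are marked, matching the four
    slice-and-OR statements in get_edges. The result has the same
    height and width as the input and contains only 0s and 1s.
    """
    h, w = len(inst), len(inst[0])
    edge = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Horizontal neighbours differ: mark both pixels.
            if x + 1 < w and inst[y][x] != inst[y][x + 1]:
                edge[y][x] = 1
                edge[y][x + 1] = 1
            # Vertical neighbours differ: mark both pixels.
            if y + 1 < h and inst[y][x] != inst[y + 1][x]:
                edge[y][x] = 1
                edge[y + 1][x] = 1
    return edge


# Two stacked instances with arbitrary ids 7 and 9.
inst_map = [
    [7, 7, 7, 7],
    [7, 7, 7, 7],
    [9, 9, 9, 9],
    [9, 9, 9, 9],
]
edges = get_edges_2d(inst_map)
```

The output is a single binary map per frame: the ids 7 and 9 are gone, and only the boundary between them remains.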

So I don't know how to define fg_labels. At the same time, the generator's input_nc is defined as:

        netG_input_nc = input_nc * opt.n_frames_G
        if opt.use_instance:
            netG_input_nc += opt.n_frames_G  

This seems to set the number of instance channels to 1, right? So I am confused about it. Could you please help with this? Thank you very much!
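
The channel arithmetic in that snippet can be checked with a small sketch. The function name `generator_input_channels` is hypothetical, and the values 35 and 3 are illustrative only, not settings taken from this issue.

```python
def generator_input_channels(input_nc, n_frames_G, use_instance):
    """Reproduce the netG_input_nc computation quoted above."""
    netG_input_nc = input_nc * n_frames_G
    if use_instance:
        # One extra channel per input frame, i.e. a single edge channel each.
        netG_input_nc += n_frames_G
    return netG_input_nc


with_edges = generator_input_channels(input_nc=35, n_frames_G=3, use_instance=True)
without_edges = generator_input_channels(input_nc=35, n_frames_G=3, use_instance=False)
```

The difference between the two results equals n_frames_G, which is consistent with the instance input contributing exactly one channel per frame.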

ygfrancois commented 4 years ago

The edge map is a tensor of instance edges, in which edge pixels are 1 and all others are 0. How can the edges carry the information about different instance types that is needed to set fg_labels?