facebookresearch / VisualVoice

Audio-Visual Speech Separation with Cross-Modal Consistency

How to define the weight coefficient in mask loss? #27

Open dengyuanjie opened 1 year ago

dengyuanjie commented 1 year ago

Hello, could you please explain what the weights in the snippet below mean?

This coefficient is not mentioned in the paper, and I noticed that test.py does not compute it at all.

    # calculate loss weighting coefficient
    if self.opt.weighted_loss:
        weight1 = torch.log1p(torch.norm(audio_mix_spec1[:,:,:-1,:], p=2, dim=1)).unsqueeze(1).repeat(1,2,1,1)
        weight1 = torch.clamp(weight1, 1e-3, 10)
        weight2 = torch.log1p(torch.norm(audio_mix_spec2[:,:,:-1,:], p=2, dim=1)).unsqueeze(1).repeat(1,2,1,1)
        weight2 = torch.clamp(weight2, 1e-3, 10)
    else:
        weight1 = None
        weight2 = None
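
To make the question concrete, here is a minimal sketch of what the snippet appears to compute, run on random tensors. It assumes the mixture spectrogram has shape `(B, 2, F, T)` with the real and imaginary parts on dim 1, which is my reading of the `dim=1` norm and the `repeat(1,2,1,1)`, not something confirmed by the authors. The helper names `magnitude_weight` and `weighted_l1` are hypothetical, and the loss is only an illustration of how such a weight could be applied, not the repository's actual loss.

```python
import torch


def magnitude_weight(mix_spec: torch.Tensor) -> torch.Tensor:
    """Log-magnitude weight, mirroring the quoted snippet.

    Assumption: mix_spec is (B, 2, F, T) with real/imag parts on dim 1.
    """
    # L2 norm over the real/imag channel gives the magnitude per
    # time-frequency bin; the last frequency bin is dropped as in the snippet.
    magnitude = torch.norm(mix_spec[:, :, :-1, :], p=2, dim=1)
    # log1p compresses the dynamic range; clamp bounds it to [1e-3, 10].
    weight = torch.clamp(torch.log1p(magnitude), 1e-3, 10)
    # Duplicate the weight so it matches a two-channel mask.
    return weight.unsqueeze(1).repeat(1, 2, 1, 1)


def weighted_l1(pred_mask, gt_mask, weight=None):
    """Hypothetical weighted L1 loss, for illustration only."""
    diff = torch.abs(pred_mask - gt_mask)
    if weight is not None:
        # Bins where the mixture is louder contribute more to the loss.
        diff = diff * weight
    return diff.mean()


torch.manual_seed(0)
mix = torch.randn(4, 2, 257, 64)   # made-up mixture spectrogram
pred = torch.rand(4, 2, 256, 64)   # made-up predicted mask
gt = torch.rand(4, 2, 256, 64)     # made-up ground-truth mask

w = magnitude_weight(mix)
loss_weighted = weighted_l1(pred, gt, w)
loss_plain = weighted_l1(pred, gt, None)
```

Under this reading, the weight only rescales each bin's contribution to the training loss (near-silent bins get about `1e-3`, loud bins at most `10`), which would be consistent with test.py not needing it, since no loss is computed at inference. Confirmation from the authors would still be welcome.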