Open rishabhjain16 opened 3 years ago
The output of the f0 predictor is a 257-dim logit vector rather than a one-hot vector, so you need to use a cross-entropy loss, as indicated in the paper.
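(A minimal PyTorch sketch of that loss, for anyone following along; the shapes and names here are illustrative, not taken from the repo:)
import torch
import torch.nn.functional as F

B, T, C = 2, 192, 257                  # batch, frames, pitch bins (illustrative sizes)
logits = torch.randn(B, T, C)          # raw 257-dim per-frame outputs of the f0 predictor
target = torch.randint(0, C, (B, T))   # ground-truth bin index per frame

# cross_entropy wants the class dimension second: input (B, C, T), target (B, T)
loss = F.cross_entropy(logits.transpose(1, 2), target)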
Thank you for your quick response. I understand what you are saying, and I found that in the appendix of the paper. What I meant to ask about are the two values you use to calculate the loss. How are you getting a 257-dim value of f0_org to feed into the loss function?
The loss function requires two values. One is f0_pred, which is the output of the F0_Converter model. What is the other value?
In other words, what is the other input for the cross-entropy loss?
The target is the quantized ground-truth f0, based on https://arxiv.org/abs/2004.07370.
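In code, the quantization is roughly the following (a sketch only, assuming the f0 has already been normalized to [0, 1] with non-positive values marking unvoiced frames; the repo's quantize_f0_torch follows this general idea, but check it for the exact details):
import torch
import torch.nn.functional as F

def quantize_f0_sketch(f0, num_bins=256):
    # f0: (B, T), normalized pitch in [0, 1]; values <= 0 are treated as unvoiced
    uv = f0 <= 0
    idx = torch.round(f0.clamp(0, 1) * (num_bins - 1)).long() + 1  # voiced -> bins 1..256
    idx[uv] = 0                                                    # bin 0 reserved for unvoiced
    onehot = F.one_hot(idx, num_bins + 1).float()                  # (B, T, 257)
    return onehot, idx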
Thanks for your help. The paper covered most of my doubts. Great read.
In the 'Train the generator' section of solver.py:
self.G = self.G.train()
self.P = self.P.train()
# G Identity mapping loss
x_f0 = torch.cat((x_real_org, f0_org), dim=-1)
x_f0_intrp = self.Interp(x_f0, len_org)
f0_org_intrp = quantize_f0_torch(x_f0_intrp[:,:,-1])[0]
x_f0_intrp_org = torch.cat((x_f0_intrp[:,:,:-1], f0_org_intrp), dim=-1)
# G forward
x_pred = self.G(x_f0_intrp_org, x_real_org, emb_org)
g_loss_id = F.mse_loss(x_real_org, x_pred, reduction='mean')
# Preprocess f0_trg for P
x_f0_trg = torch.cat((x_real_trg, f0_trg), dim=-1)
x_f0_intrp_trg = self.Interp(x_f0_trg, len_trg)
# Target for P
f0_trg_intrp = quantize_f0_torch(x_f0_intrp_trg[:,:,-1])[0]
# P forward
f0_pred = self.P(x_real_org, f0_trg_intrp)
f0_trg_intrp_indx = f0_trg_intrp.argmax(2)  # class-index target, shape (B, T)
p_loss_id = F.cross_entropy(f0_pred.transpose(1,2), f0_trg_intrp_indx, reduction='mean')
# Backward and optimize.
g_loss = g_loss_id
p_loss = p_loss_id
self.reset_grad()
g_loss.backward()
p_loss.backward()
self.g_optimizer.step()
self.p_optimizer.step()
# Logging.
loss = {}
loss['G/loss_id'] = g_loss_id.item()
loss['P/loss_id'] = p_loss_id.item()
This appears to be working for me (i.e. it seems to run, at least!)
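One detail that is easy to get wrong: F.cross_entropy over a sequence expects the class dimension second, i.e. an input of shape (B, 257, T) and an integer target of shape (B, T). A quick sanity check, assuming P returns per-frame logits of shape (B, T, 257):
# f0_pred:      (B, T, 257) -> transpose(1, 2) before cross_entropy
# f0_trg_intrp: (B, T, 257) one-hot -> argmax over the last dim gives (B, T) indices
assert f0_pred.transpose(1, 2).shape[1] == 257
assert f0_trg_intrp.argmax(2).shape == f0_pred.shape[:2]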
Hello, I want to know where x_real_trg comes from...
I've changed some of the code around since, but hopefully this helps a bit. Both 'org' and 'trg' are just different instances. I had tried applying some of the code from elsewhere in the repo to training, so I used those naming conventions. You can see here that I've used the same instances to train both models:
x_real_org, emb_org, f0_org, len_org = next(data_iter)
# applies .to(self.device) to each:
x_real_org, emb_org, len_org, f0_org = self.data_to_device([x_real_org, emb_org, len_org, f0_org])
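# (data_to_device is just my own small helper; roughly equivalent to
#  def data_to_device(self, tensors): return [t.to(self.device) for t in tensors])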
# combines spect and f0s
x_f0 = torch.cat((x_real_org, f0_org), dim=-1)
# Random resampling with linear interpolation
x_f0_intrp = self.Interp(x_f0, len_org)
# strip the f0 track from the resampled tensor and quantize it
f0_org_intrp = quantize_f0_torch(x_f0_intrp[:,:,-1])[0]
self.G = self.G.train()
# combines quantized f0 back with spect
x_f0_intrp_org = torch.cat((x_f0_intrp[:,:,:-1], f0_org_intrp), dim=-1)
# G forward
x_pred = self.G(x_f0_intrp_org, x_real_org, emb_org)
g_loss_id = F.mse_loss(x_pred, x_real_org, reduction='mean')
# Backward and optimize.
self.g_optimizer.zero_grad()
g_loss_id.backward()
self.g_optimizer.step()
loss['G/loss_id'] = g_loss_id.item()
# =================================================================================== #
# 3. F0_Converter Training #
# =================================================================================== #
self.P = self.P.train()
f0_trg_intrp_indx = f0_org_intrp.argmax(2)  # class-index target, shape (B, T)
# P forward
f0_pred = self.P(x_real_org, f0_org_intrp)
p_loss_id = F.cross_entropy(f0_pred.transpose(1,2), f0_trg_intrp_indx, reduction='mean')
self.p_optimizer.zero_grad()
p_loss_id.backward()
self.p_optimizer.step()
loss['P/loss_id'] = p_loss_id.item()
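For completeness, roughly how I'd use the trained P at conversion time (a hedged sketch, not copied from Demo.ipynb; it assumes P returns per-frame logits of shape (B, T, 257), and in real conversion the quantized one-hot f0 would come from the target speaker rather than f0_org_intrp):
self.P = self.P.eval()
self.G = self.G.eval()
with torch.no_grad():
    f0_logits = self.P(x_real_org, f0_org_intrp)             # (B, T, 257)
    f0_idx = f0_logits.argmax(-1)                            # (B, T) bin indices
    f0_onehot = torch.nn.functional.one_hot(f0_idx, 257).float()
    # converted pitch contour goes back through G together with the spectrogram
    x_converted = self.G(torch.cat((x_real_org, f0_onehot), dim=-1), x_real_org, emb_org)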
I am trying to replicate your work. I am currently building the F0 converter model to generate the P checkpoint, and I am stuck at the loss calculation.
I see that when I use the F0_Converter model to generate P, I get a 257-dimensional one-hot encoded feature from P.
Demo.ipynb
I wanted to ask: when training the F0 converter model, what is the value you use to calculate the loss?
I tried using the following value, but I am not sure if that is the right way. This is what I am doing to generate f0_pred and to calculate the loss:
I just want to know if I am on the right track. Can you help me out here, @auspicious3000?