Closed chen849157649 closed 3 years ago
Thanks for your feedback.
In fact, most of the methods implemented in the repo use pairwise loss instead of pointwise loss.
With pointwise loss, the model is trained as a binary classification task, i.e., some data points have true labels and some have false ones.
With pairwise loss, however, one true label is combined with one or more false labels, and the whole group is treated as a single data point. Specifically, most methods in the repo view the task as a pseudo multi-class classification task: one true data point and $K$ false data points are put together, and the model is expected to give the true item the highest score. Since the true data point is always placed first, followed by the false ones, the correct class index is always 0, so y_true is in fact all 0s.
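To make the "all 0s" target concrete, here is a minimal dependency-free sketch of the quantity that nn.CrossEntropyLoss computes with class-index targets and the default mean reduction, when the clicked item is at index 0 of every row. The function name and the scores are hypothetical and only for illustration; this is not the repo's code.

```python
import math

def cross_entropy_first_is_true(scores):
    """Mean of -log softmax(row)[0] over all rows.

    Each row holds the scores of 1 clicked item (index 0)
    followed by K non-clicked items.
    """
    total = 0.0
    for row in scores:
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(s - m) for s in row))
        total += log_sum_exp - row[0]  # -log softmax at index 0
    return total / len(scores)

# Hypothetical model output: batch of 2 samples, K = 2 negatives each.
y_pred = [[2.0, 0.5, -1.0],
          [0.1, 0.3, 0.2]]
# The target class index is 0 for every row, which is why
# the training code can build y_true with torch.zeros(len(y_pred)).long().
y_true = [0] * len(y_pred)

loss = cross_entropy_first_is_true(y_pred)
```

The loss only shrinks when the score at index 0 grows relative to the other candidates in the same row, which is what makes this a ranking-style objective rather than a per-item classification.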
Thanks for your answer.
I want to use pointwise loss for training, with criterion = nn.BCEWithLogitsLoss() instead of criterion = nn.CrossEntropyLoss():
temp_y = [i.numpy() for i in minibatch['clicked']]
y = torch.from_numpy(np.array(temp_y).T).float().to(device)
loss = criterion(y_pred, y)
Training log:
Load training dataset with size 16243111.
Time 00:01:19, batches 100, current loss 0.6936, average loss: 0.8338, latest average loss: 0.8338
Training runs fine like this. Is there any problem with this approach?
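For comparison, a minimal dependency-free sketch of the pointwise variant: the quantity nn.BCEWithLogitsLoss computes with its default mean reduction, where every candidate gets its own 0/1 label. The layout of the clicked labels (one list per candidate position, each of batch length) is an assumption inferred from the .T in the snippet above, and all names and values here are hypothetical.

```python
import math

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over every (sample, candidate) element."""
    losses = []
    for row_x, row_y in zip(logits, targets):
        for x, y in zip(row_x, row_y):
            # Numerically stable form of
            # -[y*log(sigmoid(x)) + (1-y)*log(1-sigmoid(x))]
            losses.append(max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x))))
    return sum(losses) / len(losses)

# Assumed layout: one list per candidate position, each of batch length.
clicked = [[1.0, 1.0],   # position 0: the clicked item
           [0.0, 0.0],   # position 1: a non-clicked item
           [0.0, 0.0]]   # position 2: a non-clicked item
# Transpose to (batch, 1 + K) so the labels line up with the scores.
y = [list(col) for col in zip(*clicked)]

# Hypothetical raw scores with the same (batch, 1 + K) shape.
y_pred = [[2.0, 0.5, -1.0],
          [0.1, 0.3, 0.2]]

loss = bce_with_logits(y_pred, y)
```

One thing worth checking in a log like the one above: a logit of 0 gives a per-element loss of ln 2 ≈ 0.693, so a current loss near that value can mean the scores are still close to zero. Whether that is the case here cannot be told from the log alone.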
I believe you're right. But in my experiments, the score is lower with pointwise loss.
Thank you!
Your code:
y = torch.zeros(len(y_pred)).long().to(device)
loss = criterion(y_pred, y)
I feel it should be loss = criterion(y_pred, y_true).