yusanshi / news-recommendation

Implementations of some methods in news recommendation.
MIT License

loss error #12

Closed chen849157649 closed 3 years ago

chen849157649 commented 3 years ago

Thank you!
Your code:
y = torch.zeros(len(y_pred)).long().to(device)
loss = criterion(y_pred, y)

I feel it should be loss = criterion(y_pred, y_true).

yusanshi commented 3 years ago

Thanks for your feedback.

In fact, most of the methods implemented in the repo use pairwise loss instead of pointwise loss.

With pointwise loss, the model is trained as a binary classification task, i.e., each data point is scored on its own, and some data points have true labels while others have false ones.

With pairwise loss, however, one true label is combined with one or more false labels, and the whole group is treated as a single data point. Specifically, the methods in the repo view the task as a pseudo multi-class classification task: one true data point and $K$ false data points are put together, and the model is expected to give the true item the highest score. Since the true data point is always put in the first place, followed by the false ones, y_true is in fact all 0s (the index of the correct class).
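As a minimal sketch of the idea above (not the repo's actual training loop), the snippet below uses random scores in place of a model. The values of `batch_size` and `K` are hypothetical, and `y_pred` is assumed to have shape `(batch_size, 1 + K)` with the clicked candidate in column 0.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch_size, K = 4, 2  # hypothetical sizes: K negatives per positive

# Stand-in for the model output: one row per sample,
# column 0 = clicked candidate, columns 1..K = sampled negatives.
y_pred = torch.randn(batch_size, 1 + K)

# The "correct class" is always index 0, so the target is all zeros.
y = torch.zeros(len(y_pred)).long()

criterion = nn.CrossEntropyLoss()
loss = criterion(y_pred, y)

# Same quantity written out by hand: mean negative log-softmax of column 0.
manual = -torch.log_softmax(y_pred, dim=1)[:, 0].mean()
```

Here `y` holds class indices rather than click labels, which is why it is all zeros even though every row contains one positive.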

chen849157649 commented 3 years ago

Thanks for your answer. I want to use pointwise loss for training, with criterion = nn.BCEWithLogitsLoss() instead of criterion = nn.CrossEntropyLoss().

So instead of

y = torch.zeros(len(y_pred)).long().to(device)

I use

temp_y = [i.numpy() for i in minibatch['clicked']]
y = torch.from_numpy(np.array(temp_y).T).float().to(device)
loss = criterion(y_pred, y)

Training log:
Load training dataset with size 16243111.
Time 00:01:19, batches 100, current loss 0.6936, average loss: 0.8338, latest average loss: 0.8338

It trains fine like this. Is there any problem with doing so?
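For reference, a self-contained sketch of the pointwise variant described above. It assumes `minibatch['clicked']` is a list of `1 + K` tensors, each of shape `(batch_size,)`, which is what the transpose in the snippet suggests; the `clicked` list, `batch_size` and `K` below are made-up stand-ins. `torch.stack(..., dim=1)` is used here in place of the NumPy round trip and should give the same `(batch_size, 1 + K)` layout.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)
batch_size, K = 4, 2  # hypothetical sizes

# Stand-in for minibatch['clicked']: first entry is the positive (1s),
# the remaining K entries are the sampled negatives (0s).
clicked = [torch.ones(batch_size, dtype=torch.long)] + [
    torch.zeros(batch_size, dtype=torch.long) for _ in range(K)
]

# Stand-in for the model output, shape (batch_size, 1 + K).
y_pred = torch.randn(batch_size, 1 + K)

# Per-candidate float labels with the same shape as y_pred.
y = torch.stack(clicked, dim=1).float()

criterion = nn.BCEWithLogitsLoss()
loss = criterion(y_pred, y)

# Reference point: all-zero logits (probability 0.5 everywhere) give ln(2).
baseline = criterion(torch.zeros_like(y), y)
```

The `baseline` value is ln 2 ≈ 0.6931, so a current loss of 0.6936 early in training is roughly what a model predicting 0.5 for every candidate would produce.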

yusanshi commented 3 years ago

I believe you're right. However, in my experiments, the score is lower with pointwise loss.