Why can the code below, from the project, be used as a "loss"?
oloss = t.bmm(ovectors, ivectors).squeeze().sigmoid().log().mean(1)
nloss = t.bmm(nvectors, ivectors).squeeze().sigmoid().log().view(-1, context_size, self.n_negs).sum(2).mean(1)
As I understand it, a "loss" should be "prediction" minus "actual result".
But in the code above, "oloss" is computed from the prediction alone, with no operation involving the actual result.
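I think of a loss as any quantity that needs to be optimized; it does not have to be literally "prediction" minus "actual result". From this perspective, the goal of the given code is to maximize the gap in likelihood between positive and negative cases: "oloss" scores the observed context words and "nloss" scores the sampled negative words.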
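To see where the "actual result" went, it may help to compare this kind of objective with ordinary binary cross-entropy. The sketch below is pure Python on scalar scores, not the project's code. It assumes two things that are not visible in the quoted snippet: that the negative vectors are negated before the dot product, and that the final loss is the negative of oloss + nloss. The function names are hypothetical. Under those assumptions the labels are implicit: every positive pair has label 1 and every sampled negative has label 0, so the label terms simply drop out of the formula.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def binary_cross_entropy(p, y):
    """Standard BCE between a predicted probability p and a label y (0 or 1)."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def sgns_loss(pos_score, neg_scores):
    """Negative-sampling style loss for one (center, context) pair.

    pos_score:  dot product with the observed context word
    neg_scores: dot products with the sampled negative words
    Assumption: negative scores are negated before the sigmoid, and the
    summed log-likelihood is negated so that it can be minimized.
    """
    oloss = math.log(sigmoid(pos_score))
    nloss = sum(math.log(sigmoid(-s)) for s in neg_scores)
    return -(oloss + nloss)

# Illustrative scores, not taken from any real model.
pos_score = 2.0
neg_scores = [-1.0, 0.5, -3.0]

# The same quantity written as BCE with explicit labels: 1 for the
# observed pair, 0 for each sampled negative.
bce_total = binary_cross_entropy(sigmoid(pos_score), 1) + sum(
    binary_cross_entropy(sigmoid(s), 0) for s in neg_scores
)

assert math.isclose(sgns_loss(pos_score, neg_scores), bce_total)
```

The two numbers agree because 1 - sigmoid(s) equals sigmoid(-s), so -log(sigmoid(-s)) is exactly the cross-entropy of a prediction against label 0. In that sense the "actual result" is still there; it is just baked into which scores get the minus sign.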