shibing624 / text2vec

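text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models such as Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.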
https://pypi.org/project/text2vec/
Apache License 2.0
4.39k stars, 392 forks

loss function #152

Open riyajatar37003 opened 3 months ago

riyajatar37003 commented 3 months ago

I am trying to understand the loss function:

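    import torch

    def calc_loss(self, y_true, y_pred):
        """Compute the in-batch cosine loss with matrix operations."""
        # Keep every other label
        y_true = y_true[::2]
        # L2-normalize each embedding row
        norms = (y_pred ** 2).sum(axis=1, keepdims=True) ** 0.5
        y_pred = y_pred / norms
        # Cosine similarity of rows (0, 1), (2, 3), ..., scaled by 20
        y_pred = torch.sum(y_pred[::2] * y_pred[1::2], dim=1) * 20
        # Pairwise differences s_i - s_j between all similarities
        y_pred = y_pred[:, None] - y_pred[None, :]
        # Mask: keep entry (i, j) only where label_i < label_j
        y_true = y_true[:, None] < y_true[None, :]
        y_true = y_true.float()
        y_pred = y_pred - (1 - y_true) * 1e12
        y_pred = y_pred.view(-1)
        # Prepend 0 so the result is log(1 + sum(exp(...)))
        y_pred = torch.cat((torch.tensor([0]).float().to(self.device), y_pred), dim=0)
        return torch.logsumexp(y_pred, dim=0)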

  1. Why do we take every other value from the true labels?
  2. Why do we take the dot product between alternating rows of y_pred?

If possible, could you share a link to the paper or documentation for this? Thanks.

shibing624 commented 3 months ago

https://spaces.ac.cn/archives/8847
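For readers who cannot follow the linked article, here is a minimal dependency-free sketch of what `calc_loss` appears to compute. It is a plain-Python restatement, not the library's code. It assumes the batch is interleaved, so rows `2k` and `2k+1` of `y_pred` are the two sentences of pair `k` and `y_true` repeats the pair label for both rows. Under that assumption, `y_true[::2]` keeps one label per pair, and the product of `y_pred[::2]` with `y_pred[1::2]` gives one cosine similarity per pair. The names `cosine` and `cosent_loss_sketch` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosent_loss_sketch(y_true, y_pred, scale=20.0):
    """Plain-Python restatement of calc_loss (assumes an interleaved batch)."""
    # Assumption: both rows of a pair carry the same label, so keep one per pair.
    labels = y_true[::2]
    # One scaled cosine similarity per pair: rows (0, 1), (2, 3), ...
    sims = [scale * cosine(a, b) for a, b in zip(y_pred[::2], y_pred[1::2])]
    # The leading 0.0 plays the role of the prepended zero, i.e. the "1 +" in the log.
    terms = [0.0]
    for i in range(len(sims)):
        for j in range(len(sims)):
            # Keep s_i - s_j only where pair i is labeled less similar than pair j.
            if labels[i] < labels[j]:
                terms.append(sims[i] - sims[j])
    # Numerically stable log-sum-exp over the kept terms.
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))

# Pair 0 is two identical vectors (cosine 1), pair 1 is orthogonal (cosine 0).
embeddings = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
loss_consistent = cosent_loss_sketch([1, 1, 0, 0], embeddings)  # labels agree with the cosines
loss_reversed = cosent_loss_sketch([0, 0, 1, 1], embeddings)    # labels contradict the cosines
```

With labels that agree with the cosine ordering the loss is close to zero, and with the labels reversed it is close to the scale factor of 20, so the loss only penalizes pairs whose similarity ranking contradicts the labels.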

riyajatar37003 commented 3 months ago

Thanks, but is there an English version?

riyajatar37003 commented 3 months ago

https://kexue.fm/archives/8847

In this article, what is meant by positive sample pairs and negative sample pairs? Does it mean the following: positive sample pair: (sent1, sent2, 1); negative sample pair: (sent1, sent3, 0), or something like that?

shibing624 commented 3 months ago

yes
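As a rough illustration of that pair format, the sketch below flattens labeled pairs into the interleaved layout that the `[::2]` and `[1::2]` slicing in `calc_loss` seems to expect. This layout is an assumption inferred from the slicing, not something confirmed in this thread, and `flatten_pairs` is a hypothetical helper rather than part of text2vec.

```python
def flatten_pairs(pairs):
    """Flatten (sent_a, sent_b, label) triples into an interleaved batch.

    Assumed layout: sentences 2k and 2k+1 belong to pair k, and the pair
    label is repeated once for each of the two sentences.
    """
    sentences, labels = [], []
    for sent_a, sent_b, label in pairs:
        sentences.extend([sent_a, sent_b])
        labels.extend([label, label])
    return sentences, labels

pairs = [
    ("sent1", "sent2", 1),  # positive pair: the two sentences are similar
    ("sent1", "sent3", 0),  # negative pair: the two sentences are not similar
]
sentences, labels = flatten_pairs(pairs)

# Slicing recovers the two sides of each pair and one label per pair.
left, right, pair_labels = sentences[::2], sentences[1::2], labels[::2]
```

Here `left` and `right` line up element by element as the two sides of each pair, which is why the loss can compare even-indexed rows with odd-indexed rows.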