liu-nlper / SLTK

A sequence labeling toolkit implementing a BLSTM-CNN-CRF model in PyTorch; it reaches an F1 of 91.10% on the CoNLL 2003 English NER test set (word and char features).

The CRF loss appears to apply batch averaging twice #11

Open Fzz123 opened 5 years ago

Fzz123 commented 5 years ago

Hi, while reading the code I noticed that the `neg_log_likelihood_loss` function in `crf.py` contains:

```python
if self.average_batch:
    return (forward_score - gold_score) / batch_size
return forward_score - gold_score
```

while the `loss` function in `sequence_labeling_model.py`, which calls it, also contains:

```python
if not self.use_crf:
    batch_size, max_len = feats.size(0), feats.size(1)
    lstm_feats = feats.view(batch_size * max_len, -1)
    tags = tags.view(-1)
    return self.loss_function(lstm_feats, tags)
else:
    loss_value = self.loss_function(feats, mask, tags)
    print('loss_value:', loss_value)
    if self.average_batch:
        batch_size = feats.size(0)
        loss_value /= float(batch_size)
    return loss_value
```

Doesn't this end up averaging over the batch twice?
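To make the effect concrete, here is a minimal sketch of the arithmetic being questioned (the toy values for `forward_score`, `gold_score`, and `batch_size` are hypothetical; only the variable names come from the snippets above):

```python
import torch

# Hypothetical stand-ins for the CRF's forward_score and gold_score.
batch_size = 4
forward_score = torch.tensor(40.0)
gold_score = torch.tensor(24.0)

# crf.py: neg_log_likelihood_loss with average_batch=True
loss_from_crf = (forward_score - gold_score) / batch_size   # 16 / 4 = 4.0

# sequence_labeling_model.py: loss() divides by batch_size again
loss_value = loss_from_crf / float(batch_size)              # 4 / 4 = 1.0

# Net effect: (forward_score - gold_score) / batch_size**2, i.e. the
# batch average is applied twice. Dividing in only one of the two
# places would give the intended per-sample mean of 4.0.
print(loss_value.item())  # 1.0
```

If the averaging is kept in only one place, the loss matches a true per-sample mean; as written, the CRF loss (and hence its gradients) is scaled down by an extra factor of `batch_size`.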

carrie0307 commented 5 years ago

Same question here — I noticed this too.

carrie0307 commented 5 years ago

> (quoting @Fzz123's comment above) Doesn't this end up averaging over the batch twice?

I think so too.