chilynn / sequence-labeling


A few small questions about the CRF layer in the model #28

Open Ethan1214 opened 6 years ago

Ethan1214 commented 6 years ago

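Hello, I have a few questions about parts of the implementation and hope you can answer them. In the batched CRF implementation, are the sentences in a batch concatenated and treated as a single sentence?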

        self.point_score = tf.gather(tf.reshape(self.tags_scores, [-1]), tf.range(0, self.batch_size * self.num_steps) * self.num_classes + tf.reshape(self.targets,[self.batch_size * self.num_steps]))
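A minimal sketch of what the flat index in the line above computes, assuming `tags_scores` has shape `[batch_size, num_steps, num_classes]` and `targets` has shape `[batch_size, num_steps]` (both shapes are assumptions, and plain Python lists stand in for tensors). Under that assumption the expression `position * num_classes + target` selects one value per token, namely the score of its gold tag, so the batch is flattened into one run of tokens for this lookup without any token reading another sentence's scores.

```python
def gather_point_scores(tags_scores, targets):
    """Pick the gold-tag score of every token via a flat index.

    tags_scores: nested list [batch][step][class] (assumed layout)
    targets:     nested list [batch][step] of gold class ids
    """
    batch_size = len(tags_scores)
    num_steps = len(tags_scores[0])
    num_classes = len(tags_scores[0][0])

    # Counterpart of tf.reshape(tags_scores, [-1])
    flat_scores = [s for sent in tags_scores for step in sent for s in step]
    # Counterpart of tf.reshape(targets, [batch_size * num_steps])
    flat_targets = [t for sent in targets for t in sent]

    # Counterpart of tf.range(0, batch_size * num_steps) * num_classes + targets
    flat_index = [pos * num_classes + flat_targets[pos]
                  for pos in range(batch_size * num_steps)]
    return [flat_scores[i] for i in flat_index]


tags_scores = [
    [[0.1, 0.2, 0.3], [1.1, 1.2, 1.3]],   # sentence 0, two steps
    [[2.1, 2.2, 2.3], [3.1, 3.2, 3.3]],   # sentence 1, two steps
]
targets = [[2, 0], [1, 1]]
point_score = gather_point_scores(tags_scores, targets)
```

Here `point_score` holds one entry per token, in sentence order, which is why the result looks like a single long sentence even though each entry comes from its own position.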

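Also, I rewrote part of the code and it now runs on a dataset with multi-class labels, but the loss becomes negative. Is that related to the constants set in the CRF layer?
According to the derivation, p(y|X) is a value between 0 and 1, so its log is negative, and negating that to get the loss should give a positive number.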
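The argument above can be checked on a toy example. The sketch below is not the repository's code: it is a brute-force linear-chain CRF with made-up emission and transition scores, where log Z is computed by enumerating every tag path. Because the gold path is one of the enumerated paths, log Z is never smaller than the gold score, so the loss `log Z - score(gold)` cannot be negative as long as both terms are built from the same set of scores.

```python
import itertools
import math


def path_score(emissions, transitions, path):
    """Unnormalized score of one tag path: emissions plus transitions."""
    score = sum(emissions[t][tag] for t, tag in enumerate(path))
    score += sum(transitions[a][b] for a, b in zip(path, path[1:]))
    return score


def log_partition(emissions, transitions):
    """log Z by enumerating every possible path (toy sizes only)."""
    num_classes = len(emissions[0])
    scores = [path_score(emissions, transitions, p)
              for p in itertools.product(range(num_classes), repeat=len(emissions))]
    top = max(scores)  # subtract the max for numerical stability
    return top + math.log(sum(math.exp(s - top) for s in scores))


def crf_nll(emissions, transitions, gold):
    """Negative log-likelihood: log Z - score(gold) = -log p(gold | X)."""
    return log_partition(emissions, transitions) - path_score(emissions, transitions, gold)


# Made-up scores: 3 time steps, 3 classes.
emissions = [[2.0, -1.0, 0.5], [0.3, 1.5, -0.7], [-0.2, 0.4, 3.0]]
transitions = [[0.5, -1.0, 0.2], [0.1, 0.8, -0.3], [-0.6, 0.0, 1.1]]
loss = crf_nll(emissions, transitions, gold=[0, 1, 2])
```

A negative loss therefore suggests that the gold-path score and the partition term are not summing the same quantities, rather than a problem with the formula itself.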
tomsonsgs commented 6 years ago

I solved the problem by changing the if condition "i==0" to "j==0" in the getTransition(y_train_batch) function, and the loss now goes to 0 as expected. You can all check that. Will this improve the final accuracy? Someone could try it.

Ethan1214 commented 6 years ago

Hi @tomsonsgs: I tried your method. It works and keeps the loss above 0, but the accuracy didn't improve noticeably. I don't understand why you break when j==0. In my opinion, breaking when j==0 means we ignore the transition score from the last word of a sentence to the ending_tag(""). For example, suppose we have a true label sequence during training: B M E O O B E ........ If we break when j==0, we ignore the transition score of "E to ". Although my loss went below 0, I think the original method is right.

Can you tell me why you break when j==0?
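One possible reading of the j==0 discussion, sketched on toy numbers: if the gold-path score includes the transition into the ending tag but the partition term does not (or the other way round), the loss is no longer guaranteed to be non-negative. Everything below is hypothetical, including the `end_scores` list and the `with_end` switch; it illustrates the mismatch and is not taken from the repository.

```python
import itertools
import math


def logsumexp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def path_score(emissions, transitions, end_scores, path, with_end):
    """Score of one path, optionally adding the 'last tag -> end tag' transition."""
    score = sum(emissions[t][tag] for t, tag in enumerate(path))
    score += sum(transitions[a][b] for a, b in zip(path, path[1:]))
    if with_end:
        score += end_scores[path[-1]]
    return score


def log_partition(emissions, transitions, end_scores, with_end):
    """log Z by enumeration, with or without the end transition."""
    num_classes = len(emissions[0])
    paths = itertools.product(range(num_classes), repeat=len(emissions))
    return logsumexp([path_score(emissions, transitions, end_scores, p, with_end)
                      for p in paths])


# Two steps, two classes; only the hypothetical end transition is non-zero.
emissions = [[0.0, 0.0], [0.0, 0.0]]
transitions = [[0.0, 0.0], [0.0, 0.0]]
end_scores = [5.0, 0.0]
gold = [0, 0]

gold_with_end = path_score(emissions, transitions, end_scores, gold, with_end=True)
# Both terms include the end transition: a valid negative log-likelihood.
consistent_loss = log_partition(emissions, transitions, end_scores, True) - gold_with_end
# Gold score includes it, partition term does not: the loss can drop below zero.
mismatched_loss = log_partition(emissions, transitions, end_scores, False) - gold_with_end
```

In this toy case the mismatch can be removed either by dropping the end transition from the gold score or by adding it to the partition term. Both give a valid loss, but they define slightly different models.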

tomsonsgs commented 6 years ago


tomsonsgs commented 6 years ago


Ethan1214 commented 6 years ago

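@tomsonsgs Looking at how he uses the transitions when computing the total path score in the forward pass, I can't see where the final transition score is left out. Could you explain which specific steps do that?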


    last_alphas = tf.gather(alphas, tf.range(0, self.batch_size) * (self.num_steps + 2) + length)

Would it work to change length to length+1 here? Then the i==0 condition shouldn't need to be changed.
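A small sketch of which row the index above selects, under an assumed layout that the thread does not confirm: each sentence owns `num_steps + 2` consecutive rows of `alphas`, with row 0 for the start state, rows 1 to `length` for the tokens, and row `length + 1` for the state after the end transition. The helper names are hypothetical.

```python
def last_alpha_rows(batch_size, num_steps, lengths, offset=0):
    """Flat row indices selected by b * (num_steps + 2) + length + offset."""
    return [b * (num_steps + 2) + lengths[b] + offset for b in range(batch_size)]


def describe_row(flat_row, num_steps, lengths):
    """Name a flat row under the ASSUMED layout: 0 = start, 1..n = tokens, n+1 = end."""
    b, row = divmod(flat_row, num_steps + 2)
    if row == 0:
        return (b, "start")
    if row <= lengths[b]:
        return (b, "token %d" % row)
    if row == lengths[b] + 1:
        return (b, "end")
    return (b, "padding")


lengths = [3, 5]                                            # two sentences, num_steps = 5
rows_length = last_alpha_rows(2, 5, lengths)                # index built from length
rows_length_plus_1 = last_alpha_rows(2, 5, lengths, 1)      # index built from length + 1
```

If the layout assumption holds, indexing with `length` stops at the last token and indexing with `length + 1` also covers the end transition. Whether that matches the actual `alphas` tensor would need to be checked against the code.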

fxh0919 commented 6 years ago


tomsonsgs commented 6 years ago

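@fxh0919 It is a very small value, meaning that the score for a node taking that class is extremely low; it can be -2000 or something similar. At the beginning the sequence must be in the start state, so the probability of every other class is 0, and in log space a probability close to 0 is usually represented by a very negative number.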
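A minimal sketch of that initialization, assuming a log-space forward recursion with a dedicated start tag (the index `START`, the toy scores, and the function names are hypothetical; only the value -2000 comes from the comment above). It compares starting from "almost certainly in the start state" with starting from exactly the start state, using `-inf` for log(0).

```python
import math


def logsumexp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def forward_log_partition(emissions, transitions, init_alpha):
    """Log-space forward recursion starting from init_alpha."""
    alpha = list(init_alpha)
    num_classes = len(alpha)
    for step_scores in emissions:
        alpha = [
            logsumexp([alpha[prev] + transitions[prev][cur]
                       for prev in range(num_classes)]) + step_scores[cur]
            for cur in range(num_classes)
        ]
    return logsumexp(alpha)


NEG = -2000.0   # stands in for log(0)
START = 2       # hypothetical index of the start tag
# Real tokens never emit START, and nothing transitions back into START.
emissions = [[1.0, 0.5, NEG], [0.2, 1.3, NEG]]
transitions = [[0.4, -0.2, NEG], [0.1, 0.6, NEG], [0.3, -0.5, NEG]]

init_soft = [NEG, NEG, 0.0]                 # other tags get a very negative score
init_hard = [-math.inf, -math.inf, 0.0]     # other tags get exactly log(0)
log_z_soft = forward_log_partition(emissions, transitions, init_soft)
log_z_hard = forward_log_partition(emissions, transitions, init_hard)
```

In double precision `exp(-2000)` underflows to 0, so the two initializations give the same log Z here. A finite constant is often preferred in practice because `-inf` can produce NaN values in some tensor operations.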