Closed chiyuzhang94 closed 4 years ago
Hi, you're right, and this is intentional. The reason is that I want to make sure the label of [PAD]
is 0 in the sequence labeling scenario, so that the various metric calculations of t2t work. Is it causing problems?
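To illustrate why pinning [PAD] to id 0 helps, here is a minimal sketch of the assumed masking convention (illustrative only, not t2t's actual metric code): if padding always maps to 0, a metric can ignore padded positions by dropping label 0.

```python
import numpy as np

# Assumed convention for illustration: '[PAD]' is mapped to id 0,
# so metrics can mask padding by ignoring label 0.
labels = np.array([0, 3, 1, 0, 2])   # 0 marks padded positions
preds  = np.array([1, 3, 1, 2, 2])

mask = labels != 0                    # keep only real (non-[PAD]) tokens
acc = (preds[mask] == labels[mask]).mean()
print(acc)                            # accuracy over real tokens only
```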
@JayYip Thanks for your reply. I am wondering whether this affects the model performance somehow.
I don't think it will. Actually, in daily work we sometimes cannot fix a clear number of classes (I know it's weird, but that's business). We set a large number instead, and the performance has proven to be quite similar. I think the situation here is much the same.
It actually makes sense, because the classes are independent in the classification layer and the redundant classes will not contribute any loss.
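A small numpy sketch of this point (not the library's actual loss code): a redundant output class whose logit stays very low, as an untrained unit's would, changes a softmax cross-entropy loss only negligibly.

```python
import numpy as np

def softmax_xent(logits, target):
    """Cross-entropy for one example: -log softmax(logits)[target]."""
    shifted = logits - logits.max()              # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[target]

# Three real classes; the true class is index 1.
logits = np.array([1.0, 3.0, 0.5])
loss_3 = softmax_xent(logits, target=1)

# Same logits plus a redundant class whose logit stays very low
# (an unused output unit the data never activates).
logits_4 = np.append(logits, -20.0)
loss_4 = softmax_xent(logits_4, target=1)

print(abs(loss_4 - loss_3) < 1e-6)  # the extra class barely moves the loss
```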
Sure. Thanks for the explanation. I will change this on my end.
I found a bug in generating the label set for a classification task.
The label set always has one extra label: [PAD]. I found that in utils.fit() (lines 33 and 35), [PAD] is added to the label set. Could you please check this issue?
https://github.com/JayYip/bert-multitask-learning/blob/9fe97739194f801e539efbadbaaf97a9c945eaaa/bert_multitask_learning/utils.py#L33
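For context, here is a hypothetical sketch of how a label set can pick up [PAD] when it is built from padded sequences; the function name and data are illustrative, not the actual bert-multitask-learning code.

```python
# Hypothetical sketch: collecting labels from already-padded sequences
# pulls '[PAD]' into the label set alongside the real labels.
def fit_label_set(padded_label_seqs):
    label_set = set()
    for seq in padded_label_seqs:
        label_set.update(seq)  # '[PAD]' slips in with the real labels
    return sorted(label_set)

seqs = [["B-PER", "O", "[PAD]", "[PAD]"],
        ["O", "B-LOC", "O", "[PAD]"]]
print(fit_label_set(seqs))  # one extra '[PAD]' label in the set
```

For a plain classification task this extra label is redundant; per the discussion above, it should not hurt performance in practice.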