[KLUE-NER] Output dimension of the encoder head #27

KLUE-benchmark / KLUE

📖 Korean NLU Benchmark

Creative Commons Attribution Share Alike 4.0 International

Description

I have a question about something that looks odd in the NER model implementation.

The KLUE paper states the following: "We linearly map each of the final hidden states from the encoder h ∈ R^{|x|×H} into a 12-dimensional real-valued vectors, corresponding to the 12 named-entity categories. We then minimize the cross-entropy loss summed over all the tokens."
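
To make the quoted setup concrete, this is how I read it in equation form. The symbols W, b, C, z_t, and y_t are my own notation and do not come from the paper; C = 12 follows the quoted sentence, and h_t denotes the row of h for token t.

```latex
z_t = W h_t + b, \qquad W \in \mathbb{R}^{C \times H},\; b \in \mathbb{R}^{C},\; C = 12

\mathcal{L} = -\sum_{t=1}^{|x|} \log \frac{\exp\left(z_{t,\,y_t}\right)}{\sum_{c=1}^{C} \exp\left(z_{t,\,c}\right)}
```

Under this reading, every token t needs a gold label y_t in {1, ..., C}, which is why the treatment of 'O' and of special tokens matters.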

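If the head maps to 12 dimensions that exclude the 'O' entity, then every token whose label is 'O' would have to be predicted as some arbitrary entity label, which I would expect to increase false positives substantially.
Special tokens added during preprocessing (for BERT, the cls token or the sep token) would also need a label of their own, so I would like to ask how these were handled.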
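
To illustrate what I would have expected instead, here is a minimal sketch in plain Python. It is not the KLUE implementation. It assumes that the 12 categories are B-/I- tags for six entity types (the type names below are my assumption, not taken from the quoted passage), adds 'O' as a 13th class, and excludes special tokens from the loss with a sentinel label. The names `LABELS`, `IGNORE_ID`, and `summed_cross_entropy` are hypothetical.

```python
import math

# Assumed entity types; six types under BIO tagging give 12 tags.
ENTITY_TYPES = ["PS", "LC", "OG", "DT", "TI", "QT"]
LABELS = [f"{prefix}-{t}" for t in ENTITY_TYPES for prefix in ("B", "I")] + ["O"]
LABEL_TO_ID = {label: i for i, label in enumerate(LABELS)}

# Assumed sentinel: positions with this label are excluded from the loss.
IGNORE_ID = -100


def summed_cross_entropy(logits, label_ids, ignore_id=IGNORE_ID):
    """Sum token-level cross-entropy, skipping positions labeled ignore_id."""
    total = 0.0
    for scores, y in zip(logits, label_ids):
        if y == ignore_id:
            continue  # special token: contributes nothing to the loss
        m = max(scores)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[y]
    return total


# Toy sequence: [CLS] token token [SEP], with uniform (all-zero) logits.
num_labels = len(LABELS)
logits = [[0.0] * num_labels for _ in range(4)]
label_ids = [IGNORE_ID, LABEL_TO_ID["B-PS"], LABEL_TO_ID["O"], IGNORE_ID]
loss = summed_cross_entropy(logits, label_ids)
```

With this layout the head has 13 outputs, 'O' tokens have a class of their own, and the cls and sep positions never enter the summed loss. Is this roughly what the baseline does?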

Thank you.
