Closed: Ulitochka closed this issue 5 years ago
We can recover the original labels by replacing the label of the first token of each entity with B-PER. For example: I-O I-O I-PER I-PER I-O -> I-O I-O B-PER I-PER I-O
What if we have 2 adjacent entities? I-O I-O I-PER I-PER I-PER I-PER I-O -> I-O I-O B-PER I-PER I-PER B-PER I-O
It fails for now, but this situation is very rare. You can switch back to BIO markup.
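A minimal sketch of the recovery heuristic discussed above, written as a standalone function. The name `io_to_bio` and the `outside` parameter are hypothetical and are not taken from ner-bert; the outside tag is assumed to be written `I-O`, as in the examples in this thread.

```python
from typing import List


def io_to_bio(labels: List[str], outside: str = "I-O") -> List[str]:
    """Mark the first token of every run of identical entity labels as B-.

    Hypothetical helper, not part of ner-bert. Assumes entity labels look
    like "I-PER" and the outside label is `outside`.
    """
    restored = []
    previous = outside
    for label in labels:
        starts_entity = (
            label != outside
            and label != previous
            and label.startswith("I-")
        )
        # Only the first token of a run gets the B- prefix.
        restored.append("B-" + label[2:] if starts_entity else label)
        previous = label
    return restored


single = io_to_bio(["I-O", "I-O", "I-PER", "I-PER", "I-O"])
adjacent = io_to_bio(["I-O", "I-O", "I-PER", "I-PER", "I-PER", "I-PER", "I-O"])
```

Here `single` comes back with the expected B-PER on the first entity token. In `adjacent`, two entities of the same type that touch each other are merged into one span, because IO labels carry no boundary between them. This is the rare failure case mentioned above.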
Does this fix, converting BIO to IO as proposed for BERT (https://arxiv.org/pdf/1810.04805.pdf), increase quality?
Ok, thanks!
Hello.
In your example (https://github.com/sberbank-ai/ner-bert/blob/master/examples/factrueval-nmt.ipynb) you are using BIO markup. But in the code (bert_data.py, line 187):
you use IO. How do you get the original markup back after training?