Open atakanokan opened 5 years ago
@atakanokan Yes, it's possible to have custom entity labels. It's like a multi-class classification problem: the model can handle any labels that exist in the training data. If your dataset is in a standard CoNLL format like https://github.com/pritishuplavikar/Named-Entity-Recognition/blob/master/wikigold.conll.txt, you can use https://github.com/microsoft/nlp/blob/master/utils_nlp/dataset/ner_utils.py to preprocess it, as shown in wikigold.py.
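To illustrate the point about labels coming from the training data, here is a minimal sketch in plain Python. It does not use the repository's `ner_utils.py`; the helper names `read_conll` and `build_label_map`, the sample text, and the `DRUG`/`SYMPTOM` labels are all made up for illustration. It assumes one token per line with the tag as the last whitespace-separated column and a blank line between sentences.

```python
from typing import Dict, List, Tuple


def read_conll(text: str) -> Tuple[List[List[str]], List[List[str]]]:
    """Split CoNLL-style text into per-sentence token lists and tag lists."""
    sentences: List[List[str]] = []
    labels: List[List[str]] = []
    tokens: List[str] = []
    tags: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        # A blank line (or a -DOCSTART- marker) ends the current sentence.
        if not line or line.startswith("-DOCSTART-"):
            if tokens:
                sentences.append(tokens)
                labels.append(tags)
                tokens, tags = [], []
            continue
        parts = line.split()
        tokens.append(parts[0])   # first column: the token
        tags.append(parts[-1])    # last column: the entity tag
    if tokens:  # flush the final sentence if the text has no trailing blank line
        sentences.append(tokens)
        labels.append(tags)
    return sentences, labels


def build_label_map(labels: List[List[str]]) -> Dict[str, int]:
    """Assign an integer id to every tag that occurs in the data."""
    unique = sorted({tag for sentence in labels for tag in sentence})
    return {tag: index for index, tag in enumerate(unique)}


# Hypothetical custom labels (DRUG, SYMPTOM) instead of the wikigold ones.
sample = """Aspirin B-DRUG
relieves O
severe B-SYMPTOM
headache I-SYMPTOM

Ibuprofen B-DRUG
helps O
"""

sentences, labels = read_conll(sample)
label_map = build_label_map(labels)
print(sentences)
print(label_map)
```

The label map is derived purely from the tags found in the file, which is why swapping in a different tag set needs no change to this part of the pipeline.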
@atakanokan When creating the tags for your entities, please make sure they follow the inside-outside-beginning (IOB) tagging format: https://en.wikipedia.org/wiki/Inside%E2%80%93outside%E2%80%93beginning_(tagging)
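A quick sanity check for custom tags could look like the sketch below. It assumes the stricter IOB2 variant, where every entity starts with a `B-` tag and an `I-` tag may only continue an entity of the same type; the original IOB scheme is more permissive, so adjust the rule if your data uses it. The function name `is_valid_bio` and the example labels are hypothetical.

```python
from typing import List


def is_valid_bio(tags: List[str]) -> bool:
    """Return True if the tag sequence is well-formed under IOB2 rules."""
    previous = "O"
    for tag in tags:
        if tag == "O":
            previous = tag
            continue
        prefix, _, entity = tag.partition("-")
        # Every non-O tag must look like B-<TYPE> or I-<TYPE>.
        if prefix not in ("B", "I") or not entity:
            return False
        # An I- tag must continue an entity of the same type.
        if prefix == "I" and previous not in (f"B-{entity}", f"I-{entity}"):
            return False
        previous = tag
    return True


print(is_valid_bio(["B-DRUG", "O", "B-SYMPTOM", "I-SYMPTOM"]))
print(is_valid_bio(["O", "I-DRUG"]))
```

Running a check like this over the whole dataset before training makes tagging mistakes easier to spot than debugging them through model output.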
Do all Token Classification scenarios require the input data to be in CoNLL format? I want to use this for a 3-tag multi-label classification over custom sentences, where each tag is mapped to a chunk of tokens that together form a semantic representation, not just to a single token.
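One possible way to frame chunk-level tags as token-level labels is sketched below: each chunk is given as a token span and expanded into `B-`/`I-` tags. This assumes the chunks do not overlap, so a truly multi-label setup where one token carries several tags would need a different encoding. The function name `spans_to_bio`, the span format `(start, end, label)` with an exclusive end index, and the example labels are assumptions for illustration, not something taken from the repository.

```python
from typing import List, Tuple


def spans_to_bio(num_tokens: int, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Expand (start, end, label) token spans into one BIO tag per token."""
    tags = ["O"] * num_tokens
    for start, end, label in spans:
        if not 0 <= start < end <= num_tokens:
            raise ValueError(f"span ({start}, {end}) is out of range")
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError(f"span ({start}, {end}) overlaps another span")
        tags[start] = f"B-{label}"          # first token of the chunk
        for index in range(start + 1, end):
            tags[index] = f"I-{label}"      # remaining tokens of the chunk
    return tags


tokens = "the patient reported severe chest pain after taking aspirin".split()
spans = [(3, 6, "SYMPTOM"), (8, 9, "DRUG")]  # hypothetical chunk annotations
tags = spans_to_bio(len(tokens), spans)
for token, tag in zip(tokens, tags):
    print(token, tag)
```

The printed token/tag pairs have the same one-token-per-line shape as a CoNLL file, so data annotated at chunk level can be written out in that format.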
@Kc2fresh Have you been able to solve your task (if I understand correctly, label prediction for multi-token mentions)?
Description
Is it possible to fine-tune BERT NER on custom entity labels other than those shown in https://github.com/microsoft/nlp/blob/master/examples/named_entity_recognition/ner_wikigold_bert.ipynb (Cell 4):
Other Comments
It seems possible but wanted to make sure. Procedure: