ghost opened this issue 3 years ago
Not exactly sure what your data looks like and how well BERT is doing, but 3-4 tokens doesn't seem incredibly large. You'll likely see a gain going from BERT to SpanBERT (to RoBERTa and other larger models) since most tasks follow the same trend. But if the numbers are super low to begin with, my best guess is that there are issues other than just span length.
You could still try out SpanBERT and RoBERTa. Folks at Huggingface will likely have some code you could use. Good luck!
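In case it helps, here is a minimal sketch of dropping the SpanBERT weights into HuggingFace's standard token-classification head. The checkpoint name `SpanBERT/spanbert-base-cased` is the copy published on the hub; since SpanBERT follows the plain BERT architecture, the usual head stacks on top of it. The hub repo may not ship its own tokenizer files, so falling back to `bert-base-cased` (which I believe shares the same cased WordPiece vocabulary) is a common workaround.

```python
# Minimal sketch: SpanBERT encoder + HuggingFace token-classification head.
from transformers import AutoTokenizer, AutoModelForTokenClassification

# The SpanBERT hub repo may lack tokenizer files; bert-base-cased is the usual stand-in
# since SpanBERT was trained with BERT's cased vocabulary.
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

model = AutoModelForTokenClassification.from_pretrained(
    "SpanBERT/spanbert-base-cased",
    num_labels=9,  # e.g. BIO tags for four entity types plus "O" -- adjust to your tag set
)
```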
Thanks, @mandarjoshi90. Scores are not too low for me, but sometimes when an entity spans, say, 4 tokens, BERT misclassifies the middle tokens and the whole prediction is compromised because of that. I have tagged data in BIO format. I looked into HuggingFace, but there is no proper documentation on how to train SpanBERT on a custom dataset for NER. Also, moving from bert-base to bert-large, roberta-base, and roberta-large made the performance drop sharply. Can you provide some info on how to train SpanBERT for NER? Any info on training LUKE for span classification on a custom dataset would also help.
Thanks
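For BIO-tagged data like the above, the main NER-specific step on the HuggingFace side is aligning the word-level tags to subword tokens before training. A minimal sketch, assuming a fast tokenizer; the label set and example sentence here are made up purely for illustration:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")  # fast tokenizer by default

# Hypothetical example: word-level BIO tags for one sentence.
words = ["Acute", "myeloid", "leukemia", "was", "diagnosed"]
word_labels = ["B-DISEASE", "I-DISEASE", "I-DISEASE", "O", "O"]
label2id = {"O": 0, "B-DISEASE": 1, "I-DISEASE": 2}

encoding = tokenizer(words, is_split_into_words=True, truncation=True)

aligned, previous_word = [], None
for word_id in encoding.word_ids():
    if word_id is None:                  # special tokens ([CLS], [SEP])
        aligned.append(-100)             # -100 is ignored by the cross-entropy loss
    elif word_id != previous_word:       # first subword of each word keeps the word's label
        aligned.append(label2id[word_labels[word_id]])
    else:                                # later subwords of the same word are masked out
        aligned.append(-100)
    previous_word = word_id

print(list(zip(encoding.tokens(), aligned)))
```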
@ghost @mandarjoshi90 Any update on this? I am also looking to model NER as span classification.
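If the goal is literally to treat NER as span classification rather than per-token tagging, one generic recipe is to enumerate all spans up to a maximum width, build a representation for each span from the encoder's hidden states, and classify the span as an entity type or "no entity". A rough sketch under those assumptions is below; this is not the exact head from the SpanBERT or LUKE papers, just an illustration of the framing, and the checkpoint name is the hub copy of SpanBERT.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class SpanClassifier(nn.Module):
    def __init__(self, encoder_name="SpanBERT/spanbert-base-cased",
                 num_labels=5, max_span_width=8):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden_size = self.encoder.config.hidden_size
        self.max_span_width = max_span_width
        # Span representation = [start token state; end token state] -> label
        # (num_labels should include a "no entity" class).
        self.scorer = nn.Linear(2 * hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        batch, seq_len, _ = hidden.shape
        spans, reps = [], []
        for start in range(seq_len):
            for end in range(start, min(start + self.max_span_width, seq_len)):
                spans.append((start, end))
                reps.append(torch.cat([hidden[:, start], hidden[:, end]], dim=-1))
        reps = torch.stack(reps, dim=1)   # (batch, num_spans, 2 * hidden_size)
        return spans, self.scorer(reps)   # logits per candidate span
```

At inference time you would keep the highest-scoring non-"no entity" spans and resolve overlaps, e.g. greedily by score.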
Hi,
I am looking to train an NER model with custom entities using SpanBERT, because some entity mentions span 3-4 tokens and NER with BERT token classification fails to classify them correctly. I looked around the git repo and some articles related to SpanBERT and the TACRED dataset; I found that TACRED does include NER entities, but I did not find anything on whether SpanBERT can be used for NER. Any help on how to use SpanBERT for NER (if it can be used) would be appreciated.
Thank you!
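For anyone landing here later: since SpanBERT is architecturally a BERT encoder, it can be fine-tuned for NER the same way as BERT. A rough end-to-end sketch with the Trainer API, assuming the BIO data has already been converted to a `datasets` object and aligned to subwords (see the alignment snippet earlier in the thread); `tokenized_train`, `tokenized_dev`, and `num_labels` are placeholders for your own data:

```python
from transformers import (AutoTokenizer, AutoModelForTokenClassification,
                          DataCollatorForTokenClassification, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "SpanBERT/spanbert-base-cased", num_labels=num_labels)  # num_labels from your tag set

args = TrainingArguments(
    output_dir="spanbert-ner",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_train,   # placeholder: your aligned training split
    eval_dataset=tokenized_dev,      # placeholder: your aligned dev split
    data_collator=DataCollatorForTokenClassification(tokenizer),
)
trainer.train()
```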