plkmo / BERT-Relation-Extraction

PyTorch implementation for "Matching the Blanks: Distributional Similarity for Relation Learning" paper
Apache License 2.0

About relation and entity labelling in pretraining #20

Closed wailoktam closed 3 years ago

wailoktam commented 3 years ago

Hi, you mention a pretrained BERT model as a requirement, but you also do your own pretraining, right?

The downloadable pretrained model is trained on a corpus that does not come with labels for relation extraction.

It sounds like you do your pretraining on text whose entity and relation labels are generated automatically by spaCy, and so may not be 100% correct. Am I right about this?

For the training (I suppose this is fine-tuning), you use the SemEval dataset, whose entity and relation labels are supposed to be manually checked and 100% correct.

I suppose you have also tried the downloadable pretrained model, which does not get any labels for entities and relations. How much worse is the performance in that case?

Thanks.

plkmo commented 3 years ago

The original BERT pretrained on a text corpus + MTB pretraining gives better results after fine-tuning, compared to just the original BERT pretrained on a text corpus.
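For context on what the MTB pretraining step adds on top of plain BERT: the paper wraps the two entity mentions in marker tokens and, with some probability (α = 0.7 in the paper), replaces each mention with a `[BLANK]` token, so the model must match relation statements without relying on the surface forms of the entities. Below is a minimal sketch of that input construction — a hypothetical helper, not this repo's actual code — assuming the entity spans have already been found (e.g. by spaCy's NER during the automatic labelling):

```python
import random

def mark_and_blank(tokens, e1_span, e2_span, blank_prob=0.7, rng=random):
    """Sketch of MTB input construction (hypothetical helper, not the repo's code).

    Wraps the two entity mentions in [E1]...[/E1] / [E2]...[/E2] markers and,
    with probability blank_prob (alpha = 0.7 in the paper), replaces each
    mention with a single [BLANK] token. Spans are (start, end) token indices,
    end-exclusive, assumed non-overlapping.
    """
    out = []
    for i, tok in enumerate(tokens):
        if i == e1_span[0]:
            out.append("[E1]")
        if i == e2_span[0]:
            out.append("[E2]")
        if e1_span[0] <= i < e1_span[1]:
            # Emit the whole mention once, at its first token: either a
            # [BLANK] placeholder or the mention verbatim.
            if i == e1_span[0]:
                out.extend(["[BLANK]"] if rng.random() < blank_prob
                           else tokens[e1_span[0]:e1_span[1]])
        elif e2_span[0] <= i < e2_span[1]:
            if i == e2_span[0]:
                out.extend(["[BLANK]"] if rng.random() < blank_prob
                           else tokens[e2_span[0]:e2_span[1]])
        else:
            out.append(tok)
        if i == e1_span[1] - 1:
            out.append("[/E1]")
        if i == e2_span[1] - 1:
            out.append("[/E2]")
    return out

# blank_prob=0.0 keeps the mentions; blank_prob=1.0 always blanks them.
tokens = "Bill Gates founded Microsoft in 1975".split()
print(mark_and_blank(tokens, (0, 2), (3, 4), blank_prob=0.0))
print(mark_and_blank(tokens, (0, 2), (3, 4), blank_prob=1.0))
```

The marker tokens (and `[BLANK]`) are added to the tokenizer vocabulary as special tokens before pretraining, so they each get their own embedding.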