dmis-lab / biobert

Bioinformatics'2020: BioBERT: a pre-trained biomedical language representation model for biomedical text mining
http://doi.org/10.1093/bioinformatics/btz682

Running relation extraction on custom text #117

Open mathuryash5 opened 4 years ago

mathuryash5 commented 4 years ago

How do I run relation extraction on raw sentences?

AndreasSaka commented 4 years ago

@mathuryash5 BioBERT performs relation classification, not relation extraction per se. At the sentence level, it assumes that the co-occurrence of two entities in a sentence signals a positive relation. To run it on your raw sentences, you need to recreate the input format: a test.tsv file in which each row has an index, a sentence, and a binary label (0 or 1). You will also need train.tsv and dev.tsv.
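As a rough illustration of the input described above, here is a minimal sketch that turns raw sentences into tab-separated rows with an index, a sentence, and a label. The function name `build_test_tsv`, the header row, and the placeholder label of 0 are assumptions for illustration only; the exact layout expected by the repository's scripts should be checked against the provided datasets.

```python
def build_test_tsv(sentences, placeholder_label=0):
    """Return TSV text with index, sentence, and label columns.

    Hypothetical helper: the header names and the placeholder label
    are assumptions, not confirmed by the BioBERT repository.
    """
    lines = ["index\tsentence\tlabel"]
    for i, sentence in enumerate(sentences):
        # Collapse tabs, newlines, and repeated spaces so that a
        # sentence can never spill into another column or row.
        clean = " ".join(sentence.split())
        lines.append(f"{i}\t{clean}\t{placeholder_label}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    raw = [
        "Mutations in BRCA1 are associated with breast cancer.",
        "TP53 was measured in all samples.",
    ]
    # Write the rows to test.tsv in the current directory.
    with open("test.tsv", "w", encoding="utf-8") as handle:
        handle.write(build_test_tsv(raw))
```

The label column is filled with a dummy value here only because the format calls for one; whether that value is ignored at prediction time is not established in this thread.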

ghost commented 3 years ago

@AndreasSaka, to run the BioBERT RE task on raw sentences, can you explain why we still need a label between values [0,1]? Can we rely on the pretrained BioBERT model to run the RE task on raw sentences that have only an index and a sentence, with no label [0,1] specified?