dmis-lab / biobert

Bioinformatics'2020: BioBERT: a pre-trained biomedical language representation model for biomedical text mining
http://doi.org/10.1093/bioinformatics/btz682

Provide QA BioASQ previous version #158

Open abdallah197 opened 3 years ago

abdallah197 commented 3 years ago

Hi BioBERT team,

Would it be possible to share the datasets that were provided before the new preprocessing? Previously I could run the QA script on the evaluation files directly, but now I keep running into this error:

  File "/home/abashir/anaconda3/envs/mpi/lib/python3.7/site-packages/transformers/data/processors/squad.py", line 670, in _create_examples
    answers = qa["answers"]
KeyError: 'answers'
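
For context, the traceback suggests that the SQuAD processor in transformers reads `qa["answers"]` for every question, so an evaluation file whose questions lack that key would fail in this way. A minimal sketch of one possible workaround follows, assuming the evaluation file is SQuAD-style JSON (`data` → `paragraphs` → `qas`). The helper names `add_missing_answers` and `patch_file` are hypothetical and are not part of BioBERT or transformers.

```python
import json


def add_missing_answers(squad):
    """Add an empty "answers" list to every question that lacks one.

    Assumes a SQuAD-style dict: data -> paragraphs -> qas.
    Returns the number of questions that were patched.
    """
    patched = 0
    for article in squad.get("data", []):
        for paragraph in article.get("paragraphs", []):
            for qa in paragraph.get("qas", []):
                if "answers" not in qa:
                    qa["answers"] = []
                    patched += 1
    return patched


def patch_file(src_path, dst_path):
    """Read a SQuAD-style JSON file, patch it, and write it to dst_path."""
    with open(src_path, encoding="utf-8") as f:
        squad = json.load(f)
    patched = add_missing_answers(squad)
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(squad, f, ensure_ascii=False)
    return patched
```

Calling `patch_file` with an input and an output path would write a copy in which every question has an `answers` key. Whether the rest of the pipeline accepts empty answer lists is not verified here.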

I updated the datasets because, whenever I run the biocodes script on the predictions, I hit an assertion error:

assert len(multiQid) == (24+4) # Please use the lateset version of QA datasets. All multiQids should have length of 24 + 4 (3 for Sub id)

If I comment out that line, the code runs normally, though. What difference will it make?
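
One way to explore that question is a small check on the question IDs in the predictions. The sketch below relies only on the assertion's own comment (a 24-character ID plus a 4-character sub-ID suffix). It additionally assumes, without having confirmed it in the biocodes source, that the script recovers the base ID by stripping the last four characters. The function names are hypothetical.

```python
def find_malformed_qids(qids, base_len=24, suffix_len=4):
    """Return the IDs whose length differs from base_len + suffix_len."""
    expected = base_len + suffix_len
    return [qid for qid in qids if len(qid) != expected]


def group_by_base_id(qids, suffix_len=4):
    """Group IDs by the prefix left after dropping the last suffix_len chars.

    Assumption: this mirrors how sub-questions are merged downstream.
    """
    groups = {}
    for qid in qids:
        groups.setdefault(qid[:-suffix_len], []).append(qid)
    return groups
```

Under that assumption, an ID from an older dataset that has no sub-ID suffix would lose the last four characters of its real ID when grouped, so its answers could be written out under a truncated question ID even though the script finishes without an error.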

abdallah197 commented 3 years ago

@wonjininfo @jhyuklee