dmis-lab / biobert

Bioinformatics'2020: BioBERT: a pre-trained biomedical language representation model for biomedical text mining
http://doi.org/10.1093/bioinformatics/btz682

Not able to replicate the results reported in the README #160

Open Shailendra77 opened 3 years ago

Shailendra77 commented 3 years ago

I am using the following commands for prediction:

python3 run_ner.py --do_train=false --do_predict=true --do_eval=true --vocab_file=$BIOBERT_DIR/vocab.txt --bert_config_file=$BIOBERT_DIR/bert_config.json --init_checkpoint=$BIOBERT_DIR/model.ckpt-1000000 --num_train_epochs=10.0 --data_dir=$NER_DIR --output_dir=$OUTPUT_DIR

python3 biocodes/ner_detokenize.py --token_test_path=$OUTPUT_DIR/token_test.txt --label_test_path=$OUTPUT_DIR/label_test.txt --answer_path=$NER_DIR/test.tsv --output_dir=$OUTPUT_DIR

perl biocodes/conlleval.pl < $OUTPUT_DIR/NER_result_conll.txt
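
As a cross-check that does not depend on the Perl script, entity-level precision, recall, and F1 can be approximated directly from the gold and predicted tag sequences. The following is only a minimal sketch, assuming BIO-style tags (either bare `B`/`I`/`O` or typed, such as `B-Disease`) and exact span matching. The names `extract_spans` and `entity_f1` are hypothetical helpers, not part of the BioBERT repository, and the sketch is not a replacement for `conlleval.pl`, whose handling of edge cases may differ.

```python
def extract_spans(tags):
    """Return the set of (start, end, type) entity spans in one BIO-tagged sentence."""
    spans = set()
    start, ent_type = None, None
    # A trailing "O" sentinel closes any span still open at the end of the sentence.
    for i, tag in enumerate(list(tags) + ["O"]):
        prefix, _, cur_type = tag.partition("-")
        # An open span ends at O, at a new B, or at an I whose type differs.
        if start is not None and (prefix != "I" or cur_type != ent_type):
            spans.add((start, i, ent_type))
            start = None
        # A span starts at B, or at an I with no open span (lenient handling).
        if prefix in ("B", "I") and start is None:
            start, ent_type = i, cur_type
    return spans


def entity_f1(gold_sentences, pred_sentences):
    """Compute exact-match entity precision, recall, and F1 over parallel sentences."""
    tp = n_gold = n_pred = 0
    for gold, pred in zip(gold_sentences, pred_sentences):
        gold_spans = extract_spans(gold)
        pred_spans = extract_spans(pred)
        tp += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy data for illustration only: two gold entities, one of them predicted.
    gold = [["B", "I", "O", "B"]]
    pred = [["B", "I", "O", "O"]]
    print(entity_f1(gold, pred))
```

If a score computed this way is far from the `conlleval.pl` output on the same file, the mismatch is more likely in the detokenization or alignment step than in the model itself.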

I also tried fine-tuning and evaluating on different datasets (NCBI disease, BC4CHEMD, BC2GM), but I was not able to reach the results reported in the README.