salrowili / BioM-Transformers

BioM-Transformers: Building Large Biomedical Language Models with BERT, ALBERT and ELECTRA
Apache License 2.0

Paper link

Abstract

The impact of design choices on the performance of biomedical language models recently has been a subject for investigation. In this paper, we empirically study biomedical domain adaptation with large transformer models using different design choices. We evaluate the performance of our pretrained models against other existing biomedical language models in the literature. Our results show that we achieve state-of-the-art results on several biomedical domain tasks despite using similar or less computational cost compared to other models in the literature. Our findings highlight the significant effect of design choices on improving the performance of biomedical language models.

Pre-Trained Models (PyTorch)
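
A minimal sketch of how a PyTorch checkpoint might be loaded with the Hugging Face `transformers` library to obtain contextual embeddings. The hub identifier `sultan/BioM-ELECTRA-Large-Discriminator` and the helper name `encode` are assumptions of this example and are not stated in this README; substitute the identifier of the checkpoint you actually want to use.

```python
# Assumed hub identifier (not given in this README); replace as needed.
DEFAULT_MODEL_ID = "sultan/BioM-ELECTRA-Large-Discriminator"


def encode(text: str, model_id: str = DEFAULT_MODEL_ID):
    """Return the last-layer hidden states for `text` as a PyTorch tensor."""
    # Imported lazily: both packages are heavy, and the weights are
    # downloaded the first time a checkpoint is requested.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # disable dropout for inference

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # Shape: (1, number_of_tokens, hidden_size)
    return outputs.last_hidden_state
```

Calling `encode("Mutations in the CFTR gene cause cystic fibrosis.")` would download the weights and return one vector per token. For a downstream task, the same identifier can be passed to a task-specific `AutoModelFor...` class instead of `AutoModel`.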

Pre-Trained LM Models (TensorFlow)

| Model | Corpus | Vocab | Batch Size | Training Steps | Link |
| --- | --- | --- | --- | --- | --- |
| BioM-ELECTRA-Base | PubMed Abstracts | 29K PubMed | 1024 | 500K | link |
| BioM-ELECTRA-Large | PubMed Abstracts | 29K PubMed | 4096 | 434K | link |
| BioM-BERT-Large | PubMed Abstracts + PMC | 30K EN Wiki + Books Corpus | 4096 | 690K | link |
| BioM-ALBERT-xxlarge | PubMed Abstracts | 30K PubMed | 8192 | 264K | link |
| BioM-ALBERT-xxlarge-PMC | PubMed Abstracts + PMC | 30K PubMed | 8192 | +64K | link |
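
As a rough way to compare the pre-training budgets in the table above, the sketch below multiplies batch size by training steps to estimate how many sequences each model processed. This proxy is an assumption of this example, not the cost measure used in the paper: it ignores sequence length, model size and hardware. The +64K continuation run of BioM-ALBERT-xxlarge-PMC is left out because the table only gives its additional steps.

```python
# Batch size and training steps copied from the table above.
PRETRAINING_RUNS = {
    "BioM-ELECTRA-Base": (1024, 500_000),
    "BioM-ELECTRA-Large": (4096, 434_000),
    "BioM-BERT-Large": (4096, 690_000),
    "BioM-ALBERT-xxlarge": (8192, 264_000),
}


def sequences_seen(batch_size: int, steps: int) -> int:
    """Number of training sequences processed: one batch per step."""
    return batch_size * steps


if __name__ == "__main__":
    for name, (batch_size, steps) in PRETRAINING_RUNS.items():
        total = sequences_seen(batch_size, steps)
        print(f"{name:<22} {total / 1e9:.2f}B sequences")
```

Under this proxy, BioM-ELECTRA-Base processes about 0.51B sequences and the three large models between roughly 1.8B and 2.8B.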

SQuAD Fine-Tuned Checkpoints (TensorFlow)

| Model | Exact Match (EM) | F1 Score | Link |
| --- | --- | --- | --- |
| BioM-ELECTRA-Base-SQuAD2 | 81.35 | 84.20 | Link |
| BioM-ELECTRA-Large-SQuAD2 | 85.48 | 88.27 | Link |
| BioM-ELECTRA-Large-MNLI-SQuAD2 | 85.24 | 88.01 | Link |
| BioM-ALBERT-xxlarge-SQuAD2 | 83.86 | 86.99 | Link |
| BioM-ALBERT-xxlarge-MNLI-SQuAD2 | 84.35 | 87.31 | Link |
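
For readers unfamiliar with the two score columns, the sketch below shows how Exact Match and token-level F1 are commonly computed for a single prediction, following the answer normalization of the official SQuAD evaluation script (lowercasing, then stripping punctuation and English articles). It is a simplified illustration: it does not handle multiple reference answers or the unanswerable questions of SQuAD 2.0, and the function names are this example's own.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

The figures in the table are these per-question scores averaged over the evaluation set and expressed as percentages.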

We exploit transferability between MNLI and SQuAD, which is explained in detail by Jeong et al. (2020). Our participation in BioASQ9B is described in this Paper. To check the performance of our systems (UDEL-LAB) on the official BioASQ leaderboard, visit http://participants-area.bioasq.org/results/9b/phaseB/ .

GluonNLP (MXNet) Checkpoints

More information about GluonNLP: https://github.com/dmlc/gluon-nlp

| Model | Link |
| --- | --- |
| BioM-ELECTRA-Base | Link |
| BioM-ELECTRA-Large | Link |

| Model | Exact Match (EM) | F1 Score | Link |
| --- | --- | --- | --- |
| BioM-ELECTRA-Base-SQuAD2 | 80.93 | 83.86 | Link |
| BioM-ELECTRA-Large-SQuAD2 | 85.34 | 88.09 | Link |

Colab Notebook Examples

BioM-ELECTRA-Large on NER and ChemProt tasks Open In Colab

BioM-ELECTRA-Large on SQuAD2.0 and BioASQ7B Factoid tasks Open In Colab

BioM-ALBERT-xxlarge on SQuAD2.0 and BioASQ7B Factoid tasks Open In Colab

Text Classification Task With HuggingFace Transformers and PyTorchXLA on Free TPU Open In Colab

Fine-Tuning BioM-Transformers on a Question Answering dataset with TPU and Torch XLA Open In Colab

Reproducing our BLURB results with JAX Open In Colab

Fine-Tuning BioM-Transformers with JAX/Flax on TPUv3-8 with free Kaggle resources Open In Colab

Acknowledgment

We would like to thank the TensorFlow Research Cloud (TFRC) team for granting us access to TPUv3 units.

Citation

BioM-Transformers Model

@inproceedings{alrowili-shanker-2021-biom,
title = "{B}io{M}-Transformers: Building Large Biomedical Language Models with {BERT}, {ALBERT} and {ELECTRA}",
author = "Alrowili, Sultan and
Shanker, Vijay",
booktitle = "Proceedings of the 20th Workshop on Biomedical Language Processing",
month = jun,
year = "2021",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2021.bionlp-1.24",
pages = "221--227",
abstract = "The impact of design choices on the performance of biomedical language models recently has been a subject for investigation. In this paper, we empirically study biomedical domain adaptation with large transformer models using different design choices. We evaluate the performance of our pretrained models against other existing biomedical language models in the literature. Our results show that we achieve state-of-the-art results on several biomedical domain tasks despite using similar or less computational cost compared to other models in the literature. Our findings highlight the significant effect of design choices on improving the performance of biomedical language models.",
}

Question Answering with BioM-Transformers

@article{alrowili2021large,
  title={Large biomedical question answering models with ALBERT and ELECTRA},
  author={Alrowili, Sultan and Shanker, K},
  url = "http://ceur-ws.org/Vol-2936/paper-14.pdf",
  journal={CLEF (Working Notes)},
  year={2021}
}
@inproceedings{alrowili2022exploring,
  title={Exploring Biomedical Question Answering with BioM-Transformers At BioASQ10B challenge: Findings and Techniques},
  author={Alrowili, Sultan and Vijay-Shanker, K},
  year={2022},
  organization={CEUR Workshop Bologna, Italy}
}