Description of Changes
Adds the NCBI BERT models from https://github.com/ncbi-nlp/NCBI_BERT.
Paper: https://arxiv.org/abs/1906.05474
Some important notes for application: the best-performing model varied widely from task to task. Counterintuitively, the base variants outperformed the large variants on some tasks. Additionally, the model that was also trained on the MIMIC dataset sometimes performed worse where the authors expected it to perform better.
Summary of performance:
Related Issue(s), if any