Congrats on the interesting project and good results! I was wondering if you have considered adding a comparison with BioMedLM (available from HuggingFace), or fine-tuning it on your dataset. In their blog post, they claim a score of 50.3% on MedQA-USMLE. It's not clear to me how this relates to the USMLE benchmark scores you report.