Closed anonymoususerr1 closed 1 year ago
I used this notebook. The JSON files were created with eval_usmle.py.
Using the notebook you shared, I always get nan as the score for every model, including your pretrained models such as medalpaca-lora-7b-8bit. Do you have any idea why this happens?
Does the model fail on some scores and produce nan? I believe that if even one nan is in the list, numpy will return nan when it averages the whole list.
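To illustrate what I mean, here is a minimal sketch. The `scores` array is made up for the example and is not taken from the notebook or from eval_usmle.py; I am only assuming the scores end up in a numpy array that gets averaged. It shows how a single nan propagates through `np.mean`, how to count the nan entries, and how `np.nanmean` skips them.

```python
import numpy as np

# Hypothetical per-question scores; one entry is nan (e.g. an answer that could not be parsed).
scores = np.array([1.0, 0.0, np.nan, 1.0])

# A single nan makes the ordinary mean nan.
plain_mean = np.mean(scores)

# Count how many entries are nan to see how widespread the problem is.
nan_count = int(np.isnan(scores).sum())

# nanmean ignores nan entries and averages the remaining three values.
safe_mean = np.nanmean(scores)

print(plain_mean, nan_count, safe_mean)
```

If `nan_count` is small, the nan entries are probably a handful of unparsed answers. If every entry is nan, the problem is more likely in how the scores are computed in the first place, and `np.nanmean` would only hide it.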
Hi, can you share the script to compute the scores after evaluating each model on USMLE data via eval_usmle.py?