Santosh-Gupta closed 4 years ago
I built my own medical subset before the one you linked was released:
https://github.com/koursaros-ai/MSMarco-bio
However, I found that the BioBERT model trained on a large portion of the general dataset did just as well.
Will you be releasing the BioBERT model trained on the general dataset as well?
MS MARCO has a medical subset (here: https://github.com/Georgetown-IR-Lab/covid-neural-ir/blob/master/med-msmarco-train.txt).
I was wondering whether the BioBERT version was trained on the full MS MARCO dataset or only on the medical subset?