AGBonnet opened 6 months ago
Added choice de-shuffling before evaluation, so that the self-consistency majority answer can be selected.
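The de-shuffling could look something like the sketch below: each ensemble member sees a shuffled copy of the choices, we keep a map back to the original positions, and the majority vote is taken over de-shuffled indices. Function names and the map layout are just illustrative, not the actual implementation.

```python
import random
from collections import Counter

def shuffle_choices(choices: list[str], seed: int) -> tuple[list[str], dict[int, int]]:
    """Shuffle answer choices; return the shuffled list and a map
    from shuffled position back to the original position."""
    rng = random.Random(seed)
    order = list(range(len(choices)))
    rng.shuffle(order)
    shuffled = [choices[i] for i in order]
    unshuffle = {new: old for new, old in enumerate(order)}
    return shuffled, unshuffle

def majority_answer(predictions: list[int], unshuffle_maps: list[dict[int, int]]) -> int:
    """De-shuffle each ensemble member's predicted index, then majority-vote."""
    original_indices = [m[p] for p, m in zip(predictions, unshuffle_maps)]
    return Counter(original_indices).most_common(1)[0][0]
```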
Remaining problems:
In the original MedPrompt paper, KNN exemplars are selected only among training samples the model already answered correctly, so we'd need to run standard inference before running MedPrompt inference. We could probably drop that condition, though.
If we want to combine CoT and KNN few-shot, we need CoT data for all training samples. We can either:
(1) Get access to reference CoTs for the whole training set. ThoughtSource has done some work on this, which might be worth looking into. In the paper they say they provide CoTs for all datasets except PubMedQA and MedQA, but those two do appear in the GitHub repo:
> We created these reference CoTs by converting rationales provided by original datasets into reasoning chains.
(2) Self-generate explanations (as done in the original MedPrompt paper).
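Option (2) could be sketched as below: prompt the model for a step-by-step rationale per training example and, following the MedPrompt paper, keep a generated CoT only when its final answer matches the gold label. `generate` and `extract_answer` are hypothetical callables standing in for the actual inference and parsing code.

```python
def build_cot_store(examples, generate, extract_answer):
    """Self-generate a CoT for each training example.

    `examples` is a list of dicts with 'id', 'question', 'answer' (assumed layout).
    `generate` prompts the model for a rationale plus final answer (hypothetical).
    `extract_answer` parses the final answer out of the completion (hypothetical).
    Keep a CoT only when its answer matches the gold label, as in MedPrompt.
    """
    store = {}
    for ex in examples:
        prompt = f"Question: {ex['question']}\nLet's think step by step."
        completion = generate(prompt)
        if extract_answer(completion) == ex["answer"]:
            store[ex["id"]] = completion
    return store
```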
Meditron x MedPrompt
Here's a first step to run MedPrompt on Meditron. This code is untested and your reviews are very welcome.
MedPrompt is composed of 3 steps:
1. KNN-based few-shot exemplar selection
2. Self-generated chain-of-thought (CoT)
3. Choice-shuffling ensemble with majority vote
NOTE: in the original paper, candidate KNN exemplars are QA pairs the model 'aced' in 0-shot. For now, this is not implemented, because it requires running inference on the dataset beforehand. However, we might be able to collect correct QA pairs from our past 0-shot evaluation runs.
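The exemplar-selection step with that condition could look like the sketch below: rank training samples by cosine similarity to the query embedding, optionally restricting the pool to samples marked correct (e.g. harvested from past 0-shot evaluation runs). The `embedding`/`correct` record layout is an assumption for illustration.

```python
import math

def knn_exemplars(query_emb, train, k=5, require_correct=True):
    """Select the k nearest training examples by cosine similarity.

    `train` is a list of dicts with 'embedding' and 'correct' keys
    (hypothetical layout); the 'correct' filter mirrors the paper's
    condition and can be dropped with require_correct=False.
    """
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    pool = [ex for ex in train if ex["correct"]] if require_correct else train
    return sorted(pool, key=lambda ex: cos(query_emb, ex["embedding"]), reverse=True)[:k]
```

A real run would swap the brute-force cosine loop for a proper nearest-neighbour index once the exemplar pool gets large.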
For reference, here is the MedPrompt algorithm: