jind11 / MedQA

Code and data for MedQA

About the Results in the paper. #1

Open · littlefive5 opened this issue 3 years ago

littlefive5 commented 3 years ago

I reran the code in the IR repo, but I only got 22% accuracy, which is far lower than the 34% reported in the paper for USMLE. Is there any other setting needed for the IR method?
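
For reference, this is roughly how I compute accuracy from the solver's predictions (a sketch assuming each jsonl line has question, options, and answer_idx fields; not the exact repo code):

```python
import json

def evaluate(solver, path):
    """Compute accuracy of `solver` over a MedQA-style jsonl file."""
    correct = total = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            ex = json.loads(line)
            # solver takes the question text and the options dict
            # (label -> option text) and returns a predicted label, e.g. "A"
            pred = solver(ex["question"], ex["options"])
            correct += int(pred == ex["answer_idx"])
            total += 1
    return correct / total
```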

jind11 commented 3 years ago

I just updated the data in the GitHub repo by adding the 4-options version of the data I previously used. Let me know whether you can replicate the numbers in my paper using this version of the data. Thanks!
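
Something like this should confirm you are loading the new 4-options files (a rough sketch; the path and field names are assumptions based on this thread, so adjust them to your checkout):

```python
import json

with open("4_options/dev.jsonl", encoding="utf-8") as f:
    examples = [json.loads(line) for line in f]

print(len(examples), "questions")
# every question should now have exactly 4 answer options
assert all(len(ex["options"]) == 4 for ex in examples)
```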

zyngielg commented 3 years ago

I tried running solvers/textsearch.py on 4_options/dev.jsonl:

  • for topn=10 the accuracy was 28.7%
  • for topn=5 the accuracy was 27.7%

littlefive5 commented 3 years ago

Me too. I also tried the TW dataset and got about 31% with topn=15.
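
In case it helps with debugging, this is roughly how I swept topn, reusing the evaluate() sketch above; ir_solver here is a hypothetical stand-in for the retrieval and scoring logic inside solvers/textsearch.py, not its real API:

```python
# `ir_solver` is a hypothetical stand-in for textsearch.py's retrieval call
def make_solver(topn):
    return lambda question, options: ir_solver(question, options, topn=topn)

for topn in (5, 10, 15):
    acc = evaluate(make_solver(topn), "4_options/dev.jsonl")
    print(f"topn={topn}: accuracy={acc:.1%}")
```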

jind11 commented 3 years ago

Hmm, this code was written a year ago, and I did my best to release the old code here without re-running and verifying it. Thanks for helping me find this issue. I will check on my side for the source of the performance discrepancy, but I may not finish until after 5/17, which is the conference deadline I am currently busy with. I am sorry for the slow process, and thank you for your great patience.

MotzWanted commented 2 years ago

@jind11, did you get a chance to solve the issue?

And is it possible for you to release the reader model as well?

vlievin commented 2 years ago

@jind11 What is the status on reproducing the results? Are you 100% sure the dataset is correct? Great work, by the way; this is a very nice dataset you have built here!