ncats / translator-workflows


Workflow 3B: MCATS Exam Questions #54

Open karafecho opened 5 years ago

karafecho commented 5 years ago

Workflow Overview
The overall goal of this workflow is to use MCATS questions to "train" the prototype Translator System. The idea was sparked by a presentation that John Prader gave at RENCI on the development of IBM Watson (Winter 2018). For R&D of Watson Health, IBM “trained” the prototype system on USMLE questions and an army of medical students.

While USMLE questions seemed too complex for this early stage of the Translator program, MCATS questions seem ideal. The questions also meet most of the QI group’s criteria for a “good question” for testing/evaluating the prototype Translator system: ground-truth answer (plus incorrect answers that are close to the correct one [i.e., “tricky” incorrect answers]); data; infrastructure and services; relevance to SMEs; reasoning; collaboration; and googleability (not so much). The idea is to develop a sufficiently challenging workflow and stress test of the Translator system, one that can be crafted into an interesting and compelling (yet easy-to-prepare-and-digest) “TIDBIT story”.

Three sets of five MCATS questions each have been identified and grouped (roughly) into three focus areas (or modules): Molecular & Cellular Physiology; Tissue & System Anatomy; and Pathophysiology & Symptomatology. Each set contains at least one "NOT", "EXCEPT", or "LEAST LIKELY" question, as these require a somewhat different implementation approach than the more straightforward multiple-choice questions. The questions were taken directly from Khan Academy MCATS Practice Questions and were not edited for spelling, grammar, etc.
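As a rough illustration of why the "NOT", "EXCEPT", and "LEAST LIKELY" questions need a different approach, here is a minimal sketch. It assumes that some upstream component has already assigned each answer choice a numeric support score; that component, the score values, and the names `is_negated` and `pick_answer` are all hypothetical and are not part of any existing Translator service. For a straightforward question the best-supported choice is selected, whereas for a negated question the selection is inverted.

```python
import re

# Cue words named in the workflow description. Matching is case-sensitive
# on purpose: in this sketch only the capitalized cues count as negation.
NEGATION_CUES = re.compile(r"\b(NOT|EXCEPT|LEAST LIKELY)\b")


def is_negated(question: str) -> bool:
    """Return True if the question stem contains one of the negation cues."""
    return NEGATION_CUES.search(question) is not None


def pick_answer(question: str, support: dict[str, float]) -> str:
    """Pick an answer label from hypothetical per-choice support scores.

    Straightforward questions take the best-supported choice; negated
    questions take the least-supported choice instead.
    """
    chooser = min if is_negated(question) else max
    return chooser(support, key=support.get)
```

For example, with made-up scores `{"A": 0.9, "B": 0.2, "C": 0.7}`, a straightforward question would select "A", while a question ending in "EXCEPT" would select "B". A real implementation would likely need more than a sign flip, since weak evidence for a choice is not the same as evidence against it.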

For additional background and sample questions, see notes from a January 20, 2019 mini-hackathon.

Contact
The contact for Workflow 3B is Kara Fecho.

karafecho commented 5 years ago

Reminder

Email from January 31, 2019

All,

I am writing to follow up wrt the new Workflow 3B.

As you may recall, we agreed to put a hold on Workflow 3, which is based on a USMLE question, and replace it with Workflow 3B, which is based on MCATS questions. As such, I have selected three sets of five MCATS questions each, grouped (roughly) into three focus areas (or modules): Molecular & Cellular Physiology; Tissue & System Anatomy; and Pathophysiology & Symptomatology. I thought this categorization scheme might be more appropriate than Khan Academy's categorization scheme and might lend itself to the identification of missing knowledge sources and/or capabilities. I included at least one "NOT", "EXCEPT", or "LEAST LIKELY" question within each set of questions, as these require a somewhat different approach than the more straightforward multiple choice questions. Workflow 3B issues can be found here.

In terms of implementation, we could tackle this workflow in a variety of ways. For instance, each implementation team member could claim ownership of a module of their choice. Alternatively, the implementation team (all of you) could select one or more questions from each module for comparative implementation. Other possibilities exist. I will leave that decision up to you.

Please let me know if you have any questions.

Thanks,

Kara

dkoslicki commented 5 years ago

@karafecho Can we put the actual questions in the Google drive "Queries and Interface" folder and format them similar to the other workflows? It will be easier to track progress on these and break them down into sub-modules if the usual WF format is followed (such as for WF1, WF2, etc.). Currently, it looks like the questions just reside in the mini-hackathon summary.

karafecho commented 5 years ago

@dkoslicki : Actually, the questions I selected are ROUGHLY categorized into three modules and have been posted as GitHub issues: #55 (Molecular & Cellular Physiology), #56 (Tissue and System Anatomy), and #57 (Pathophysiology & Symptomatology). I chose five questions per module. Each module contains at least one "NOT", "EXCEPT", or "LEAST LIKELY" question.

If you want me to create a bid matrix for the questions, I certainly can; however, my understanding was that the bid matrices have been replaced by GitHub issues. @MarkDWilliams : perhaps you can confirm?

dkoslicki commented 5 years ago

@karafecho Ah, I see: linking those issues to this issue (like you just did) should be sufficient.