Searching - What are the questions a researcher would like to ask?

davidmiller commented 10 years ago

First, let us reference the useful input from openhealthcare/opal#114

Last:

Please add questions that researchers may want to ask the search system so that we can then translate these into user stories/concrete implementation tasks.

michaeledwardmarks commented 10 years ago

Examples from openhealthcare/opal#114:

1 - Show me all patients who had community acquired pneumonia as an unchanged diagnosis
2 - show me all the patients where diagnosis was at any point PUO - and then show me what their final list of diagnoses were
3 - show me everyone where an HIV test was positive between 2014-2015
4 - show me everyone who was at some point tagged ID
5 - what percentage of blood cultures was an organism entered for
6 - show me the patients who went to Nigeria and whose reason for travel was VFR
7 - what antibiotics did we give people with pyelonephritis

@maggiearmstrong is the queen of database searching on the current system. Suggest we ask her to outline some common requests she gets for searching the existing database - and would be helpful to know both requests that we can and can't currently achieve with the existing database so we have a benchmark.

GabPoll commented 10 years ago

I'll add another example to that: "What antibiotics did we use to treat a patient with an infection caused by organism X, resistant to antibiotic Y?" - a micro-slanted question, but important in an era of increasing antibiotic resistance. Essentially, a way to search for subfields within the "Investigations" modal.

Happy to take advice from @maggiearmstrong though.
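
To make the "search subfields within Investigations" idea concrete, here is a minimal Python sketch. It assumes each patient record carries a list of investigations with `organism` and `resistant_to` subfields plus a list of antimicrobials; those field names and the sample records are invented for illustration and are not the real elCID schema.

```python
def antibiotics_for_resistant_organism(patients, organism, antibiotic):
    """Return {patient_id: antimicrobials} for patients with at least one
    investigation showing `organism` resistant to `antibiotic`."""
    matches = {}
    for patient in patients:
        hit = any(
            inv["organism"] == organism and antibiotic in inv["resistant_to"]
            for inv in patient["investigations"]
        )
        if hit:
            matches[patient["patient_id"]] = patient["antimicrobials"]
    return matches


# Invented sample records (hypothetical field names).
patients = [
    {"patient_id": "p1",
     "investigations": [{"organism": "organism X", "resistant_to": ["antibiotic Y"]}],
     "antimicrobials": ["antibiotic Z"]},
    {"patient_id": "p2",
     "investigations": [{"organism": "organism X", "resistant_to": []}],
     "antimicrobials": ["antibiotic Y"]},
    {"patient_id": "p3",
     "investigations": [{"organism": "organism W", "resistant_to": ["antibiotic Y"]}],
     "antimicrobials": ["antibiotic V"]},
]

result = antibiotics_for_resistant_organism(patients, "organism X", "antibiotic Y")
print(result)
```

The point is that both conditions have to match within the same investigation entry, not just somewhere on the patient.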

davidmiller commented 10 years ago

Folks - these are super useful :) Do keep adding to the list if/when you think of 'em.

michaeledwardmarks commented 10 years ago

A few more questions.

These are less research questions, more clinical audit/service delivery questions, but important. Obviously some of the crunching would be done outside of the database, but this gives a sense of the kind of data we would want.

1) Of people where the clinical advice modal was completed (with any data):
What was the reason for interaction?
What were the diagnoses?
What were the bugs?
Which of the auditable outcomes (checkboxes) were completed?

2) Of people tagged ID-Liaison, what was the total/average number of times clinical advice was documented?
Over what time span?
Did Microbiologists also give advice (e.g. also tagged to Micro and a microbiologist completed a clinical advice modal)?
Equally, this question in reverse, e.g. of people tagged micro, how often did ID-Liaison see/give advice?

3) How many people had an indicator condition for HIV (e.g. one of a list of diagnoses)?
Of these, how many were actually offered an HIV test?
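
As a rough illustration of question 3, the Python sketch below counts patients with any diagnosis from an indicator list and how many of those were offered an HIV test. The `diagnoses` and `hiv_test_offered` fields, the placeholder condition names and the sample patients are all assumptions made up for the example.

```python
def hiv_test_offer_counts(patients, indicator_conditions):
    """Return (number with an indicator condition, number of those offered a test)."""
    eligible = [p for p in patients if indicator_conditions & set(p["diagnoses"])]
    offered = [p for p in eligible if p["hiv_test_offered"]]
    return len(eligible), len(offered)


# Placeholder names only; the real list would come from the clinical team.
indicator_conditions = {"condition A", "condition B"}

# Invented sample patients.
patients = [
    {"patient_id": "p1", "diagnoses": ["condition A"], "hiv_test_offered": True},
    {"patient_id": "p2", "diagnoses": ["condition B", "other"], "hiv_test_offered": False},
    {"patient_id": "p3", "diagnoses": ["other"], "hiv_test_offered": True},
]

eligible_count, offered_count = hiv_test_offer_counts(patients, indicator_conditions)
print(f"{offered_count} of {eligible_count} patients with an indicator condition were offered a test")
```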

michaeledwardmarks commented 10 years ago

Some suggestions from Maddy. Again envisage that most of the number crunching would be done after extraction but gives a feel for what is of interest:

1- What is the variance in duration of antibiotic therapy and length of stay in specific diagnoses (say pyelonephritis or pneumonia)
2- What are the most common diagnoses in the ID liaison service
3- What is the age distribution of patients in each firm
4- What was the length of stay for IVDU patients

michaeledwardmarks commented 10 years ago

Maddy made a suggestion, so I've put it here as a related question. Would we be able to automate some of the common audit-type queries to run on a monthly basis, to effectively generate a monthly activity report? E.g.:
In the last month, for patients tagged ID-Inpatient, give the:
Number of patients
Age and sex distribution
Diagnosis
Length of stay

Alternative would be to have these searches defined and then for someone (? @maggiearmstrong ) to run the search each month.

Can imagine each team might have some key indicators that this would be helpful for as the basis for realtime audit.
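
A monthly activity report along these lines could be sketched in Python roughly as below. It assumes an extract of episodes with `tags`, `age`, `sex`, `diagnoses`, `admitted` and `discharged` fields, and it assigns an episode to a month by its discharge date; the field names, that rule and the sample episodes are all assumptions for illustration.

```python
from collections import Counter
from datetime import date


def monthly_report(episodes, tag, year, month):
    """Summarise episodes carrying `tag` that were discharged in the given month."""
    selected = [
        e for e in episodes
        if tag in e["tags"]
        and e["discharged"].year == year
        and e["discharged"].month == month
    ]
    stays = [(e["discharged"] - e["admitted"]).days for e in selected]
    return {
        "patients": len({e["patient_id"] for e in selected}),
        "ages": sorted(e["age"] for e in selected),
        "sex": Counter(e["sex"] for e in selected),
        "diagnoses": Counter(d for e in selected for d in e["diagnoses"]),
        "mean_length_of_stay_days": sum(stays) / len(stays) if stays else None,
    }


# Invented sample episodes.
episodes = [
    {"patient_id": "p1", "tags": ["ID-Inpatient"], "age": 34, "sex": "F",
     "diagnoses": ["pyelonephritis"],
     "admitted": date(2014, 3, 1), "discharged": date(2014, 3, 5)},
    {"patient_id": "p2", "tags": ["ID-Inpatient"], "age": 61, "sex": "M",
     "diagnoses": ["pneumonia"],
     "admitted": date(2014, 3, 10), "discharged": date(2014, 3, 16)},
    {"patient_id": "p3", "tags": ["micro"], "age": 50, "sex": "F",
     "diagnoses": ["pyelonephritis"],
     "admitted": date(2014, 3, 2), "discharged": date(2014, 3, 4)},
    {"patient_id": "p4", "tags": ["ID-Inpatient"], "age": 45, "sex": "M",
     "diagnoses": ["pneumonia"],
     "admitted": date(2014, 2, 10), "discharged": date(2014, 2, 20)},
]

report = monthly_report(episodes, "ID-Inpatient", 2014, 3)
print(report)
```

Each team's key indicators could then be a different `tag` and a different set of summary fields, run on a schedule or by hand.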

GabPoll commented 10 years ago

A few other suggestions:

of all the patients tagged "micro-haem" in last x months, how many of them are prescribed drugs "a" or "b"? (really targeted at caspofungin use, in case you're interested @michaeledwardmarks )
of all the patients prescribed drug "a", what was their underlying diagnosis, or had a CT scan (with date extracted to compare to date of drug starting) (again looking at our predictor strategy for starting antifungals - no gold standard at the moment)
extract patients who had a set of investigations C diff antigen positive, toxin negative, and then subsequently C diff antigen positive and toxin positive. Would then want to pull out their a) diagnosis, b) PMHX, c) antimicrobials - again looking at risk factors for people who convert to toxin positive - once again evidence-light zone.
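
The C diff conversion question is really a sequence query, so here is a small Python sketch of one way it could be expressed. It assumes each patient's results are available as `(date, antigen_positive, toxin_positive)` tuples and treats "subsequently" as a strictly later date; both the data shape and that rule are assumptions, and the sample results are invented.

```python
from datetime import date


def converted_to_toxin_positive(results):
    """True if an antigen-positive/toxin-negative result is followed, on a
    strictly later date, by an antigen-positive/toxin-positive result."""
    first_antigen_only = None
    for when, antigen, toxin in sorted(results, key=lambda r: r[0]):
        if antigen and not toxin and first_antigen_only is None:
            first_antigen_only = when
        elif antigen and toxin and first_antigen_only is not None and when > first_antigen_only:
            return True
    return False


# Invented sample results keyed by patient.
results_by_patient = {
    "p1": [(date(2014, 1, 5), True, False), (date(2014, 1, 20), True, True)],
    "p2": [(date(2014, 1, 5), True, True)],
    "p3": [(date(2014, 2, 1), True, True), (date(2014, 2, 10), True, False)],
}

converters = [pid for pid, res in results_by_patient.items()
              if converted_to_toxin_positive(res)]
print(converters)
```

The diagnosis, PMHX and antimicrobials for the matching patients would then be pulled out in a second step.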

davidmiller commented 10 years ago

7 - what antibiotics did we give people with pyelonephritis

[Figure: antibiotics pyelonephritis]

Data http://datapipes.okfnlabs.org/csv/html?url=https://gist.githubusercontent.com/davidmiller/9255218/raw/cdf1040780af85542a20a4ec7580d8828b9b29f9/antimicrobials.csv

GabPoll commented 10 years ago

Ha! Nice to see some data output. Obviously it's a very crude look, but it demonstrates a few important things:

importance of duration of data by any user post interrogation
importance of looking at dates where possible. Patients can receive several courses of antibiotics for different diagnoses
again future possibilities of linking modals during data entry - e.g. when writing pyelonephritis, system prompts for the antimicrobial used, or alternatively if you write an antimicrobial, system asks for what diagnosis (offering options amongst those already entered, or a new one to enter). All conceptual, I know.

GabPoll commented 10 years ago

On my phone, so can't edit the entry. First point is curation, not duration!

michaeledwardmarks commented 10 years ago

Firstly - @davidmiller this is fantastic - really shows the power of the system.

Immediate thoughts are:
1) Answering this question demonstrates the need to cross-link data, as @GabPoll mentions. To really answer this question you would need to have the list of diagnoses for each patient and then cross-reference it with the antimicrobial prescribing.
2) Similar to @GabPoll's other thought, I wonder about an "Indication" field in the antimicrobials modal.

I think we could use this dataset in the elCID review meeting on 13th March. I say this because

Pyelonephritis is common and we will therefore have quite a bit of data
We don't really know what we are doing!

If @davidmiller could generate (for the same patient group: Diagnosis = Pyelonephritis):
a) Diagnoses
b) Investigations (specifically Blood Culture and Urine MC&S, but can obviously give us everything)

That would let us test how the whole process of querying for a specific question, linking multiple sets of data and then interrogating that data works.
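
For a feel of what the linking step might look like, here is a minimal Python sketch. It assumes the extract arrives as one table per modal, each row holding a `patient_id` and a `value`; that layout, the names and the sample rows are assumptions for illustration rather than the actual extract format.

```python
def link_by_patient(tables, condition):
    """Pick the cohort with `condition` in the diagnoses table, then gather
    every table's values for those patients under one record per patient."""
    cohort = {row["patient_id"] for row in tables["diagnoses"]
              if row["value"] == condition}
    linked = {pid: {name: [] for name in tables} for pid in cohort}
    for name, rows in tables.items():
        for row in rows:
            if row["patient_id"] in cohort:
                linked[row["patient_id"]][name].append(row["value"])
    return linked


# Invented sample rows (hypothetical layout).
tables = {
    "diagnoses": [
        {"patient_id": "p1", "value": "Pyelonephritis"},
        {"patient_id": "p1", "value": "other diagnosis"},
        {"patient_id": "p2", "value": "Pneumonia"},
    ],
    "antimicrobials": [
        {"patient_id": "p1", "value": "drug A"},
        {"patient_id": "p2", "value": "drug B"},
    ],
    "investigations": [
        {"patient_id": "p1", "value": "Blood Culture"},
        {"patient_id": "p1", "value": "Urine MC&S"},
    ],
}

linked = link_by_patient(tables, "Pyelonephritis")
print(linked)
```

Note this only links by patient; tying a particular antimicrobial course to a particular diagnosis would still need dates or an indication field.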

davidmiller commented 10 years ago

I have the zipfile containing all pseudonymised Pyelonephritis data as specified in #138 (prior to considerations in #147)

Can send you both copies if you want to play.

davidmiller commented 10 years ago

5 - what percentage of blood cultures was an organism entered for?

76%

[Figure: blood cultures]

View Raw Data

Raw data with R script for generating plot/answer
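
For anyone who prefers Python to R, the same kind of calculation could be sketched as below. This is not the script linked above: it assumes each blood culture row has an `organism` field that is empty or missing when nothing was entered, and the sample rows are invented, so the number it prints is unrelated to the 76% figure.

```python
def percent_with_organism(blood_cultures):
    """Percentage of blood culture rows with a non-empty organism entry."""
    if not blood_cultures:
        return None
    entered = sum(1 for bc in blood_cultures if (bc.get("organism") or "").strip())
    return 100 * entered / len(blood_cultures)


# Invented sample rows; not the real extract.
blood_cultures = [
    {"organism": "organism X"},
    {"organism": ""},
    {"organism": "organism W"},
    {"organism": None},
    {"organism": "organism X"},
]

percentage = percent_with_organism(blood_cultures)
print(f"{percentage:.0f}% of blood cultures had an organism entered")
```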

michaeledwardmarks commented 10 years ago

Is the big column no entry? Amazing!

michaeledwardmarks commented 10 years ago

Can I get the full csv set for this too?!!!!! I'm in data ecstasy!

davidmiller commented 10 years ago

As this is fundamentally too conversational to be actionable, I move to close this. Am going to extract each use case as a user story ticket. Please do add more user focused extract tickets w. specific research q's. :)

davidmiller commented 10 years ago

2 - show me all the patients where diagnosis was at any point PUO - and then show me what their final list of diagnoses were

Not in this iteration. (Historical)

davidmiller commented 10 years ago

would be helpful to know both requests that we can and can't currently achieve with the existing database so we have a benchmark.

Did we ever figure out a list of these? +10 would be super helpful.

davidmiller commented 10 years ago

1- What is the variance in duration of antibiotic therapy and length of stay in specific diagnoses (say pyelonephritis or pneumonia)
2- What are the most common diagnoses in the ID liaison service
3- What is the age distribution of patients in each firm
4- What was the length of stay for IVDU patients

IMHO these are all user-land answers, which we could already get if the dates weren't munged in the extract & the tags were historic.

davidmiller commented 10 years ago

of all the patients prescribed drug "a", what was their underlying diagnosis, or had a CT scan (with date extracted to compare to date of drug starting) (again looking at our predictor strategy for starting antifungals - no gold standard at the moment)

@GabPoll Can you start a new issue with this? Also: I don't understand it :) can you explain what data/filter you'd need & what you'd want to do w. the data for me?

openhealthcare / elcid

Searching - What are the questions a researcher would like to ask? #96