OHDSI / Circe

[Under development] CIRCE is a cohort definition and syntax compiler tool for OMOP CDMv5
Apache License 2.0

How to define the input of CIRCE using eligibility criteria #45

Closed Tian312 closed 8 years ago

Tian312 commented 9 years ago

I'm currently trying to parse raw-text clinical trial eligibility criteria into a structured format. I wonder whether it is possible to structure the eligibility criteria into a format that can serve as the input of CIRCE, so that the free-text criteria can be used to define a cohort. Some examples of the parsing results are:

Input Eligibility criteria text:

  1. Age: 18-60 years.
  2. Patients with CNS disease or testicular disease are eligible.

Output after parsing:

| Concept | Semantic group | Constraints |
| --- | --- | --- |
| age | Person | 18-60 yo |
| CNS disease | Condition | / |
| testicular disease | Condition | / |
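As a rough illustration of what such parsed rows could look like in code, here is a minimal Python sketch. The `ParsedCriterion` class, its field names, and the `parse_age_range` helper are hypothetical names introduced only for this example; they are not part of CIRCE or of any parser mentioned in this thread.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ParsedCriterion:
    """One parsed row: concept, semantic group, optional constraint (hypothetical structure)."""
    concept: str
    semantic_group: str
    constraint: Optional[str] = None   # "/" in the parsing output means no constraint
    concept_id: Optional[int] = None   # stays None when no vocabulary mapping was found


def parse_age_range(constraint: str) -> Optional[Tuple[int, int]]:
    """Turn a constraint such as '18-60 yo' into (18, 60); return None if it does not match."""
    m = re.fullmatch(r"\s*(\d+)\s*-\s*(\d+)\s*(?:yo|years?)?\.?\s*", constraint)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


rows = [
    ParsedCriterion("age", "Person", "18-60 yo"),
    ParsedCriterion("CNS disease", "Condition"),
    ParsedCriterion("testicular disease", "Condition"),
]

# Concepts that would still need a concept id before they could be used in a cohort definition.
unmapped = [r.concept for r in rows
            if r.semantic_group != "Person" and r.concept_id is None]

print(parse_age_range(rows[0].constraint))
print(unmapped)
```

Keeping `concept_id` optional makes the unmapped concepts easy to list, which is the gap the rest of this thread discusses.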

The problem is that some concepts in the criteria are not covered by existing terminologies and can’t be mapped to a concept id. So I wonder whether there is any way to structure my parsing results (e.g. in a SQL format) so that they can be used directly by CIRCE?

Thanks.

chrisknoll commented 9 years ago

No, there isn't a facility to inject SQL into the middle of a cohort inclusion/restriction criteria expression. Can you give an example of the type of concept that you are trying to identify that isn't in the CDM vocabulary?

Tian312 commented 9 years ago

Thanks for the reply.

E.g., "Sedating H1 antihistamines". I used HERMES to search for it, but it showed no results. Only "antihistamines" is a concept in the CDM vocabulary, but the criterion defines a more specific concept.

Many other examples in eligibility criteria arise for the same reason: different requirements for concept granularity. Another example is "clinically significant abnormal lab results", which occurs in eligibility criteria with high frequency, but we could only find "lab result" as a concept using HERMES. However, "lab result" alone cannot make a meaningful criterion; we need to define at least "abnormal lab result".

chrisknoll commented 9 years ago

So, on the 'abnormal result' filter: if you add a Measurement criterion and add the 'abnormal result' filter, that will restrict the people to those with a measurement with an abnormal result. However, I just checked the generated SQL, and it appears that this criterion isn't implemented! So I'd like to get right on that. But basically, the SQL it would generate is something like this:

where (m.value_as_number < m.range_low or m.value_as_number > m.range_high or m.value_as_concept_id in (4155142, 4155143))

(the two concepts above are the SNOMED standard concepts from the 'Meas Value' domain representing abnormally high or abnormally low)
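To see how a predicate of this shape behaves, here is a small Python sketch that runs it against an in-memory SQLite table. The table is a toy stand-in that carries only a handful of columns named after the CDM measurement table, the rows are made up for illustration, and the two concept ids are the ones quoted above. It is not the SQL that CIRCE generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Toy stand-in for the measurement table; only the columns the predicate touches.
conn.execute("""
    create table measurement (
        person_id           integer,
        value_as_number     real,
        range_low           real,
        range_high          real,
        value_as_concept_id integer
    )
""")

# Made-up rows for illustration only.
conn.executemany(
    "insert into measurement values (?, ?, ?, ?, ?)",
    [
        (1, 5.0, 3.0, 7.0, None),         # numeric value inside the normal range
        (2, 9.5, 3.0, 7.0, None),         # above range_high
        (3, 1.2, 3.0, 7.0, None),         # below range_low
        (4, None, None, None, 4155142),   # flagged only through the value concept
        (5, None, None, None, 0),         # no usable information
    ],
)

abnormal_person_ids = [
    row[0]
    for row in conn.execute("""
        select distinct m.person_id
        from measurement m
        where (m.value_as_number < m.range_low
               or m.value_as_number > m.range_high
               or m.value_as_concept_id in (4155142, 4155143))
        order by m.person_id
    """)
]

print(abnormal_person_ids)
```

Rows with no range and no matching value concept evaluate to NULL rather than true, so they are left out of the result instead of being counted as abnormal.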

If you are looking for specific tests, you should pick the concepts that target those specific tests. It's possible that there's a hierarchy of test types in the vocabulary, but I'm not familiar enough with that neck of the vocabulary to give guidance there. If you leave the measurement criteria in CIRCE set to 'any Measurement', it will search all measurements.

I've created issue #46 for this, and I'll have a fix checked in shortly.

On the topic of "Sedating H1 antihistamines": it sounds like a drug classification, but I can't be sure. I'll poke around HERMES and see if I can find something that looks good. Another way of finding the drug class is to take a specific drug formulation and look at its ancestors: is there a specific drug that you can think of that has the 'sedating H1 antihistamine' effect? If so, search for that in HERMES and look at the ancestors. If you can give me a specific branded drug example, I can look too.

-Chris
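The "look at the ancestors" step can also be sketched as a query over the vocabulary's ancestor table. The sketch below uses an in-memory SQLite database with toy `concept` and `concept_ancestor` tables; the column names follow the CDM, but the concept ids, the concept names, and the `ancestors_of` helper are all hypothetical and chosen only for illustration. It does not reproduce how HERMES performs the lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table concept (
        concept_id   integer primary key,
        concept_name text
    );
    create table concept_ancestor (
        ancestor_concept_id      integer,
        descendant_concept_id    integer,
        min_levels_of_separation integer
    );
""")

# Hypothetical concepts: a broad class, a narrower class, and one ingredient.
conn.executemany("insert into concept values (?, ?)", [
    (1, "drug class A (broad, hypothetical)"),
    (2, "drug class B (narrower, hypothetical)"),
    (3, "ingredient X (hypothetical)"),
])

# Ancestor rows, including the self-referencing rows with 0 levels of separation.
conn.executemany("insert into concept_ancestor values (?, ?, ?)", [
    (1, 1, 0), (2, 2, 0), (3, 3, 0),
    (1, 2, 1), (2, 3, 1), (1, 3, 2),
])


def ancestors_of(connection, descendant_id):
    """Return ancestor names for a concept, nearest ancestor first, excluding the concept itself."""
    sql = """
        select c.concept_name
        from concept_ancestor ca
        join concept c on c.concept_id = ca.ancestor_concept_id
        where ca.descendant_concept_id = ?
          and ca.min_levels_of_separation > 0
        order by ca.min_levels_of_separation
    """
    return [row[0] for row in connection.execute(sql, (descendant_id,))]


print(ancestors_of(conn, 3))
```

Starting from a known drug and walking upward like this is one way to discover whether the vocabulary holds a class narrower than the broad one found by a text search.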

pbr6cornell commented 9 years ago

This may be better suited to an extended discussion on the OHDSI forums (forums.ohdsi.org). I think the bigger-picture question is: 'what can we parse from the free-text inclusion criteria in CT.gov that can be translated into operational definitions in CIRCE which can be run against observational databases?' It is highly unlikely, if not altogether impossible, that all inclusion criteria from clinical trials can or should be translated. For example, 'able to agree to consent' is not an applicable observational data construct. The example you provide, 'clinically significant abnormal lab results', is not computable unless further specification is provided (what is 'clinically significant'? what is 'abnormal'? which labs?). However, it would be possible to specify a criterion like: "At most 0 observations with measurement of '<>' with value as number that is greater than upper limit of normal OR less than lower limit of normal."

As I see it, the lion's share of the work isn't in the CIRCE engine; it's going to fall to the CT.gov translation layer that needs to read the human-language inclusion criteria and apply heuristics to translate those into observational analysis statements. Thankfully, while CT.gov covers everything under the sun, the number of distinct structures used within the criteria is fairly finite, and you could start by working down the most frequently occurring criteria.
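One way to picture that translation layer is a triage pass that buckets criteria by structure, counts how often each structure occurs, and records whether it is computable at all. The following Python sketch assumes three hand-written regular expressions and status labels; the pattern names, the labels, and the sample criteria are hypothetical and far simpler than a real heuristic would need to be.

```python
import re
from collections import Counter

# Hypothetical structure patterns, checked in order; the first match wins.
PATTERNS = [
    ("age_range", re.compile(r"\bage\b.*?(\d+)\s*-\s*(\d+)", re.I)),
    ("abnormal_lab", re.compile(r"\babnormal\b.*\blab", re.I)),
    ("consent", re.compile(r"\bconsent\b", re.I)),
]

# Assumed status per structure, following the reasoning in the comment above.
STATUS = {
    "age_range": "computable",
    "abnormal_lab": "needs specification",   # which labs? what counts as abnormal?
    "consent": "not applicable",             # no observational data construct
    "unrecognized": "unrecognized",
}


def classify(text):
    """Return the name of the first structure pattern that matches the criterion text."""
    for name, pattern in PATTERNS:
        if pattern.search(text):
            return name
    return "unrecognized"


def triage(criteria):
    """Count criteria per structure and return (structure, count, status), most frequent first."""
    counts = Counter(classify(c) for c in criteria)
    return [(name, n, STATUS[name]) for name, n in counts.most_common()]


sample = [
    "Age: 18-60 years.",
    "Age 21-65",
    "Able to agree to consent",
    "Clinically significant abnormal lab results",
]

for structure, count, status in triage(sample):
    print(structure, count, status)
```

Sorting by frequency mirrors the suggestion to start with the most common structures, and the status column keeps criteria that cannot be computed visible instead of silently dropping them.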

chrisknoll commented 8 years ago

Will close this but feel free to continue the discussion here, or we can move this to the OHDSI forums.