Open callahantiff opened 5 years ago
rx_case_inclusion_criteria_1:
dx_case_inclusion_criteria_1: Probably correct but checking -- your query only looks at the FIRST Dx [min(condition_starttime)], which may have occurred before the steroid time interval. You would not pick up instances of the condition that did occur within the steroid time interval. Thinking clinically, I think this is right -- if there is a Dx of necrosis BEFORE the steroid, then it is hard to call this steroid induced necrosis. So I think the logic you have here is right but just wanted to be explicit that this is what your logic is doing.
all_case_inclusion_criteria_1
Remaining all_case_inclusion -- I see you used the same "shell" that looks at conditions, medications, observations, procedures even when the code set probably is "looking" only at a single domain (e.g. conditions). This is fine even if inefficient and more expensive. I suspect folks not using GBQ would object to such brute force queries. But it does work since the concept codes will only match in the right domains.
rx_control_inclusion_criteria_1
Overall logic: I think you have the logic for exclusions wrong for both cases and controls. Any exclusion removes a patient. So you need to "OR" (= UNION DISTINCT) all of the exclusions so that a patient_id that appears in ANY exclusion is included in your "EXCEPT DISTINCT" subquery. As currently written, a patient would only be excluded if they met ALL exclusion criteria.
End of mgk comments on steroid induced osteonecrosis
rx_case_inclusion_criteria_1:
- You don't need "de.route_concept_id = c.concept_id" because you never use anything from the concept table for the route. You've already decoded the routes that you want into concept_ids.
Good point, I will remove that.
- While it doesn't change anything analytically, it would probably be easier to debug if you did your group by order_date, end_date (rather than the current end_date, order_date) so that the list would be sorted first by order date.
OK, I will update accordingly.
- You are using "raw" drug orders. If you created and used drug_eras instead, you may get more intervals that are longer than 14 days because eras "merge" shorter intervals together. So, while an individual order may not meet 14 days, a drug era might. You would need to run the OHDSI provided algorithm for drug era. Jonathan Derkermajian can help you navigate this code if you wish to try it out.
That's a great idea! I wonder if it also makes sense to do that for conditions too? I won't have that in time for our AMIA Informatics Summit, but I can adjust the queries and implement this afterwards. I created issue #114
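To illustrate why eras can cross the 14-day threshold when individual orders do not, here is a minimal Python sketch. It is not the OHDSI drug era algorithm (which also rolls drugs up to ingredients); the `merge_exposures` and `has_long_exposure` helpers and the 30-day gap are assumptions made for illustration only.

```python
from datetime import date, timedelta

def merge_exposures(intervals, max_gap_days=30):
    """Merge (start, end) date pairs whose gap is at most max_gap_days.

    max_gap_days=30 is an assumed persistence window, not a value from this thread.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start - merged[-1][1] <= timedelta(days=max_gap_days):
            # Close enough to the previous interval: extend it, keeping the later end.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def has_long_exposure(intervals, min_days=14):
    """True if any single interval spans at least min_days."""
    return any((end - start).days >= min_days for start, end in intervals)

# Two hypothetical 7-day orders, one day apart.
orders = [
    (date(2019, 1, 1), date(2019, 1, 8)),
    (date(2019, 1, 9), date(2019, 1, 16)),
]
eras = merge_exposures(orders)
raw_meets_criterion = has_long_exposure(orders)
era_meets_criterion = has_long_exposure(eras)
```

Neither raw order reaches 14 days, but the merged era (Jan 1 to Jan 16) does, so this hypothetical patient would only be picked up by the era-based version of the criterion.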
For rx_case_inclusion_criteria_1 and rx_control_inclusion_criteria_1:
WITH rx_case_inclusion_criteria_1 AS (
  SELECT de.person_id, cohort.standard_code_set AS code_set
  FROM
    {database}.drug_exposure de,
    {database}.STEROIDINDUCEDOSTEONECROSIS_COHORT_VARS cohort
  -- Routes from the cohort criteria: 4132161 (PO), 4302612 (IM), 4171047
  WHERE de.route_concept_id IN (4132161, 4171047, 4302612)
    AND de.drug_concept_id = CAST(cohort.standard_concept_id AS int64)
    AND cohort.phenotype_definition_number = 1
    AND cohort.standard_code_set = {code_set_group}
  GROUP BY
    de.person_id, code_set, de.drug_exposure_order_datetime, de.drug_exposure_end_datetime
  HAVING
    COUNT(DISTINCT de.drug_concept_id) >= 1
    -- Keep only orders that span at least 14 days
    AND DATETIME_DIFF(DATETIME(de.drug_exposure_end_datetime), DATETIME(de.drug_exposure_order_datetime), day) >= 14
),
dx_case_inclusion_criteria_1: Probably correct but checking -- your query only looks at the FIRST Dx [min(condition_starttime)], which may have occurred before the steroid time interval. You would not pick up instances of the condition that did occur within the steroid time interval. Thinking clinically, I think this is right -- if there is a Dx of necrosis BEFORE the steroid, then it is hard to call this steroid induced necrosis. So I think the logic you have here is right but just wanted to be explicit that this is what your logic is doing.
Thanks for thinking about this so critically. It sounds like we are on the same page about this. I was intending for the query to run this way 😄.
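The difference between "first Dx only" and "any Dx" can be sketched in a few lines of Python. The function names and the example dates are hypothetical; the point is only that a patient diagnosed before the steroid interval and again during it is dropped by the first-Dx logic.

```python
from datetime import date

def first_dx_in_window(dx_dates, window_start, window_end):
    """True only if the EARLIEST diagnosis falls inside the steroid interval."""
    if not dx_dates:
        return False
    return window_start <= min(dx_dates) <= window_end

def any_dx_in_window(dx_dates, window_start, window_end):
    """True if ANY diagnosis falls inside the steroid interval."""
    return any(window_start <= d <= window_end for d in dx_dates)

# Hypothetical patient: necrosis Dx before the steroid course and again during it.
steroid_start, steroid_end = date(2018, 3, 1), date(2018, 3, 20)
dx_dates = [date(2017, 11, 5), date(2018, 3, 10)]

matches_first_dx = first_dx_in_window(dx_dates, steroid_start, steroid_end)
matches_any_dx = any_dx_in_window(dx_dates, steroid_start, steroid_end)
```

Here `matches_first_dx` is false while `matches_any_dx` is true, which is the intended behaviour: a diagnosis that predates the steroid is hard to call steroid induced.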
all_case_inclusion_criteria_1
- Does this run successfully? Your UNIONS return only person_ids but your outer SELECT returns person_id and code_set_group. I just don't know if this runs. If not, you may need to add {code_set_group} to each of your subqueries so they return person_id, {code_set_group}.
- You use cohort.phenotype_definition_number =9 for all subqueries, which use concept_codes from many different OMOP tables/domains. This should be fine since you will only match on concept_codes for the domain that matches the table domain. In other queries, you used different cohort.phenotype_definition_numbers for each data domain. I think what you have here is fine since the right domain codes will be found in the right tables (and you'll pick up any codes that are being stored in the wrong table!).
- BTW: you do the same in all_case_exclusion_criteria_1 and many others. Just check that the query runs correctly or if it needs to be adjusted as I mention in the first bullet. It's an easy fix if needed.
Bullets 1 and 3: Thanks for pointing this out. I did run the query to confirm that it returns the same rows as when the missing argument is added to the subqueries. That said, I agree that this seems sketchy, so I updated all of the all_... queries that use the same template to be consistent. This should be good to go!
Bullet 2: I agree that this is a brute-force approach and not the most elegant, but it works 😄! I am happy to change if there is a different way that you would like me to write this. Just let me know.
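For reference, here is a small runnable sketch of the "shell" pattern using Python's built-in sqlite3 rather than GBQ. The table contents, the `cohort_vars` name, and the `code_set_1` label are made up for illustration; SQLite's plain `UNION` removes duplicates, which BigQuery spells `UNION DISTINCT`. Each subquery returns both person_id and the code set group, so the column lists line up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cohort_vars (standard_concept_id INTEGER);
CREATE TABLE condition_occurrence (person_id INTEGER, condition_concept_id INTEGER);
CREATE TABLE drug_exposure (person_id INTEGER, drug_concept_id INTEGER);
-- Concept 100 stands in for a condition code in the code set.
INSERT INTO cohort_vars VALUES (100);
INSERT INTO condition_occurrence VALUES (1, 100), (2, 200);
-- Person 4 has the condition code stored in the "wrong" table.
INSERT INTO drug_exposure VALUES (3, 300), (4, 100);
""")

# Same code set applied to every domain table; each branch returns two columns.
query = """
SELECT person_id, :grp AS code_set_group
FROM condition_occurrence
WHERE condition_concept_id IN (SELECT standard_concept_id FROM cohort_vars)
UNION
SELECT person_id, :grp AS code_set_group
FROM drug_exposure
WHERE drug_concept_id IN (SELECT standard_concept_id FROM cohort_vars)
"""
rows = sorted(conn.execute(query, {"grp": "code_set_1"}).fetchall())
```

In this toy data the result contains person 1 (code found in the expected domain) and person 4 (code stored in the wrong table), while persons 2 and 3 never match.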
Remaining all_case_inclusion -- I see you used the same "shell" that looks at conditions, medications, observations, procedures even when the code set probably is "looking" only at a single domain (e.g. conditions). This is fine even if inefficient and more expensive. I suspect folks not using GBQ would object to such brute force queries. But it does work since the concept codes will only match in the right domains.
Consistent with response above, I agree that this is a brute-force approach and not the most elegant, but it works 😄! I am happy to change if there is a different way that you would like me to write this. Just let me know.
Overall logic: I think you have the logic for exclusions wrong for both cases and controls. Any exclusion removes a patient. So you need to "OR" (= UNION DISTINCT) all of the exclusions so that a patient_id that appears in ANY exclusion is included in your "EXCEPT DISTINCT" subquery. As currently written, a patient would only be excluded if they met ALL exclusion criteria.
This might be a bigger question on my end then. I take it that unless you commented, the other queries looked OK? I am only asking so I know whether or not I need to go back to the other queries.
So, if the logical definition says: "NOT presence of ANY A AND NOT presence of ANY B AND NOT presence of ANY C"
Would you write it like this?
EXCEPT DISTINCT
(A UNION DISTINCT B UNION DISTINCT C)
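A quick way to check the ANY-versus-ALL behaviour is a toy example in Python's sqlite3. The tables `included`, `excl_a`, `excl_b`, `excl_c` and their rows are hypothetical, and SQLite writes `EXCEPT` and `UNION` where BigQuery uses `EXCEPT DISTINCT` and `UNION DISTINCT`. The `INTERSECT` version is just one way an "ALL exclusions" result could arise, not necessarily how the original query was written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE included (person_id INTEGER);
CREATE TABLE excl_a (person_id INTEGER);
CREATE TABLE excl_b (person_id INTEGER);
CREATE TABLE excl_c (person_id INTEGER);
INSERT INTO included VALUES (1), (2), (3), (4), (5);
INSERT INTO excl_a VALUES (1), (2);
INSERT INTO excl_b VALUES (2), (3);
INSERT INTO excl_c VALUES (2);
""")

# Exclude a patient who appears in ANY exclusion set.
any_exclusion = """
SELECT person_id FROM included
EXCEPT
SELECT person_id FROM (
    SELECT person_id FROM excl_a
    UNION SELECT person_id FROM excl_b
    UNION SELECT person_id FROM excl_c
)
"""

# Exclude a patient only if they appear in ALL exclusion sets.
all_exclusions = """
SELECT person_id FROM included
EXCEPT
SELECT person_id FROM (
    SELECT person_id FROM excl_a
    INTERSECT SELECT person_id FROM excl_b
    INTERSECT SELECT person_id FROM excl_c
)
"""

kept_any = sorted(r[0] for r in conn.execute(any_exclusion))
kept_all = sorted(r[0] for r in conn.execute(all_exclusions))
```

With this data the ANY version keeps only persons 4 and 5, while the ALL version removes just person 2 and keeps 1, 3, 4, and 5.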
It looks like MIMIC-OMOP does not have de.drug_exposure_order_datetime in the drug_exposure table. Should I substitute de.drug_exposure_start_datetime for it in both CHCO and MIMIC-OMOP?
Yes. Next best alternative.
Michael G. Kahn MD, PhD Professor with Tenure, Section of Clinical Informatics, Department of Pediatrics Co-Director & Translational Informatics Core Director, Colorado Clinical and Translational Sciences Institute Director, Health Data Compass Associate Director, Colorado Center for Personalized Medicine University of Colorado Denver Anschutz Medical Campus Aurora, Colorado 80045 USA
Medical Director, Research Informatics Children's Hospital Colorado Research Institute Aurora, Colorado
From: "Tiffany J. Callahan" notifications@github.com Reply-To: callahantiff/PheKnowVec reply@reply.github.com Date: Monday, August 12, 2019 at 1:39 PM To: callahantiff/PheKnowVec PheKnowVec@noreply.github.com Cc: "Kahn, Michael" MICHAEL.KAHN@CUANSCHUTZ.EDU, Mention mention@noreply.github.com Subject: Re: [callahantiff/PheKnowVec] Query Verification: Steroid Induced Osteonecrosis Cohort (#107)
TODO FOR ME:
@mgkahn - Can you please help me verify the query to select Steroid Induced Osteonecrosis patients?
COHORT CRITERIA
Case Criteria:
(de.route_concept_id==4171047), IM (de.route_concept_id==4302612) or PO (de.route_concept_id==4132161) ≥ 14 days cumulative) (drug strings - criteria # 1)
Control Criteria:
(de.route_concept_id==4171047), IM (de.route_concept_id==4302612) or PO (de.route_concept_id==4132161) ≥ 14 days cumulative) (medication strings - criteria # 10)
Cohort Logic Table
NOTE. {database} is replaced with CHCO_DeID_Oct2018; {code_set_group} is likewise a placeholder.
Query can be found here and is also included below: