callahantiff / PheKnowVec

Translational Computational Phenotyping

Project Meeting -- 07/29/2019 @ 15:00 #109

Status: Closed (callahantiff closed this issue 5 years ago)

callahantiff commented 5 years ago

Meeting Date: 07/29/2019
Topic: Biweekly Meeting
Attendees: @mgkahn

Proposed Agenda:

  • Briefly discuss the cohort queries to decide whether I need someone to verify them. The queries can be found here:

    • ADHD - #100
    • Appendicitis - #101
    • Crohn's Disease - #102
    • Hypothyroidism - #103
    • Peanut Allergy - #104
    • Sickle Cell Disease - #105
    • Sleep Apnea - #106
    • Steroid-Induced Osteonecrosis - #107
    • Systemic Lupus Erythematosus - #108
  • Discuss ADHD results (see first tab of the spreadsheet here) and what we need to do to "confirm" or "validate" the cohorts produced by each method. The tabs for the other phenotypes are present in this spreadsheet, but are not finalized. Please ignore them for now.

UPDATE: Great meeting today! I emailed the queries that need verification (and are referenced above). As a record, those queries are also provided below:

Thanks for being willing to take a look at these (listed in order from most to least confident):

  1. ADHD: https://github.com/callahantiff/PheKnowVec/issues/100
  2. Sickle Cell Disease: https://github.com/callahantiff/PheKnowVec/issues/105
  3. Sleep Apnea: https://github.com/callahantiff/PheKnowVec/issues/106
  4. Systemic Lupus Erythematosus: https://github.com/callahantiff/PheKnowVec/issues/108
  5. Crohn's Disease: https://github.com/callahantiff/PheKnowVec/issues/102
  6. Peanut Allergy: https://github.com/callahantiff/PheKnowVec/issues/104
  7. Steroid-Induced Osteonecrosis: https://github.com/callahantiff/PheKnowVec/issues/107
  8. Appendicitis: https://github.com/callahantiff/PheKnowVec/issues/101
  9. Hypothyroidism: https://github.com/callahantiff/PheKnowVec/issues/103

For each query, would you mind checking each logical statement against its corresponding code chunk (example shown below) to make sure that I have properly covered it and didn't do something silly?

Logical Statement: age_criteria_1: Patient is 1460 days or older (age at earliest diagnosis of ADHD)

Matching SQL code chunk:

age_criteria_1 AS (
  -- Patients aged 1460 days or older at their earliest ADHD diagnosis
  SELECT co.person_id, cohort.standard_code_set AS code_set
  FROM
    {database}.person p,
    {database}.condition_occurrence co,
    {database}.ADHD_COHORT_VARS cohort
  WHERE p.person_id = co.person_id
    -- keep only conditions whose concept appears in the ADHD code list
    AND co.condition_concept_id = CAST(cohort.standard_concept_id AS int64)
    AND cohort.phenotype_definition_number = 1
    AND cohort.standard_code_set = {code_set_group}
  GROUP BY
    co.person_id, p.birth_datetime, cohort.standard_code_set
  HAVING
    -- days from birth to the earliest matching diagnosis
    DATETIME_DIFF(DATETIME(MIN(co.condition_start_datetime)), DATETIME(p.birth_datetime), DAY) >= 1460
)

database and code_set_group are placeholders that are looped over when the code is run from Python. code_set_group takes one value for each of the mapping strategies we are evaluating. phenotype_definition_number is a number I have linked to all of the “ADHD codes”; this approach helps me keep track of the different codes.
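
As a rough illustration of that looping, here is a minimal Python sketch that fills the two placeholders for every database and code set combination. The database names, the code set names, the shortened template, and the `render_queries` helper are all hypothetical stand-ins, since the issue does not show the actual Python code. Quoting the code set as a SQL string literal is also an assumption.

```python
from itertools import product

# Shortened stand-in for the age_criteria_1 chunk (not the full query from issue #100).
QUERY_TEMPLATE = """
age_criteria_1 AS (
  SELECT co.person_id, cohort.standard_code_set AS code_set
  FROM {database}.condition_occurrence co, {database}.ADHD_COHORT_VARS cohort
  WHERE cohort.standard_code_set = {code_set_group}
)
"""

# Hypothetical values: the issue does not list the databases or the code set names.
DATABASES = ["example_db_a", "example_db_b"]
CODE_SET_GROUPS = ["code_set_1", "code_set_2"]


def render_queries(template, databases, code_set_groups):
    """Return {(database, code_set_group): sql} for every combination."""
    rendered = {}
    for database, group in product(databases, code_set_groups):
        # Assumption: the code set is inserted as a quoted SQL string literal.
        rendered[(database, group)] = template.format(
            database=database, code_set_group=f"'{group}'"
        )
    return rendered


queries = render_queries(QUERY_TEMPLATE, DATABASES, CODE_SET_GROUPS)
```

Each rendered string could then be submitted to the database once per combination, which gives one cohort per database and mapping strategy.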

To help keep track of the different code chunks and how they are combined, I have added a logic table for each cohort that lists its “chunks” and how they should be combined. For example, here is the table for ADHD:

| COHORT | CHUNK | LOGICAL OPERATOR |
|---|---|---|
| CASE TYPE 1 | age_criteria_1 | AND |
| CASE TYPE 1 | dx_case_inclusion_criteria_1 | AND |
| CASE TYPE 1 | rx_case_inclusion_criteria_1 | AND |
| CASE TYPE 1 | dx_case_exclusion_criteria_1 | --- |
| | | |
| CASE TYPE 2 | age_criteria_1 | AND |
| CASE TYPE 2 | dx_case_inclusion_criteria_2 | AND |
| CASE TYPE 2 | dx_case_exclusion_criteria_1 | --- |
| | | |
| CONTROL | age_criteria_2 | AND |
| CONTROL | visit_criteria_1 | AND |
| CONTROL | rx_control_exclusion_criteria_1 | OR |
| CONTROL | dx_control_exclusion_criteria_1 | OR |
| CONTROL | dx_control_exclusion_criteria_2 | |
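
One way to read the table is that each operator joins its chunk to the next chunk in the same cohort. The Python sketch below encodes the rows and renders one expression per cohort under that reading. The `LOGIC_TABLE` structure and the `cohort_expressions` helper are hypothetical. The sketch only strings the chunk names together: it does not decide operator precedence or how the exclusion chunks are applied, because the table does not specify either.

```python
# Rows copied from the ADHD logic table: (cohort, chunk, operator joining it to the next chunk).
# None marks a chunk with no following operator ("---" or blank in the table).
LOGIC_TABLE = [
    ("CASE TYPE 1", "age_criteria_1", "AND"),
    ("CASE TYPE 1", "dx_case_inclusion_criteria_1", "AND"),
    ("CASE TYPE 1", "rx_case_inclusion_criteria_1", "AND"),
    ("CASE TYPE 1", "dx_case_exclusion_criteria_1", None),
    ("CASE TYPE 2", "age_criteria_1", "AND"),
    ("CASE TYPE 2", "dx_case_inclusion_criteria_2", "AND"),
    ("CASE TYPE 2", "dx_case_exclusion_criteria_1", None),
    ("CONTROL", "age_criteria_2", "AND"),
    ("CONTROL", "visit_criteria_1", "AND"),
    ("CONTROL", "rx_control_exclusion_criteria_1", "OR"),
    ("CONTROL", "dx_control_exclusion_criteria_1", "OR"),
    ("CONTROL", "dx_control_exclusion_criteria_2", None),
]


def cohort_expressions(rows):
    """Group rows by cohort and join each cohort's chunks with their operators."""
    grouped = {}
    for cohort, chunk, operator in rows:
        grouped.setdefault(cohort, []).append((chunk, operator))

    expressions = {}
    for cohort, parts in grouped.items():
        tokens = []
        for index, (chunk, operator) in enumerate(parts):
            tokens.append(chunk)
            # Only emit an operator when another chunk follows in the same cohort.
            if operator and index < len(parts) - 1:
                tokens.append(operator)
        expressions[cohort] = " ".join(tokens)
    return expressions


expressions = cohort_expressions(LOGIC_TABLE)
```

A reviewer could compare each rendered expression against the way the chunks are actually combined at the end of the corresponding query.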

Closing this issue. Please re-open if there is anything that needs clarification.