callahantiff / PheKnowVec

Translational Computational Phenotyping
2 stars 0 forks source link

Query Verification: Peanut Allergy Cohort #104

Open callahantiff opened 5 years ago

callahantiff commented 5 years ago

@mgkahn - Can you please help me verify the query to select Peanut Allergy patients?


COHORT CRITERIA Case Criteria:
CASE TYPE 1

CASE TYPE 2


Control Criteria:



Cohort Logic Table

COHORT CHUNK LOGICAL OPERATOR
CASE TYPE 1 all_case_inclusion_criteria_1 AND
CASE TYPE 1 mx_case_inclusion_criteria_1 OR
CASE TYPE 1 px_inclusion_criteria_1 ---
CASE TYPE 2 all_case_inclusion_criteria_1 AND
CASE TYPE 2 px_inclusion_criteria_2 OR
CASE TYPE 2 px_inclusion_criteria_3 OR
CASE TYPE 2 px_inclusion_criteria_4 ---

NOTE.

  1. Replace all instances of {database} with CHCO_DeID_Oct2018
  2. Comment out all instances of {code_set_group}

Query can be found here and is also included below:

WITH all_case_inclusion_criteria_1 AS (
SELECT person_id, {code_set_group} AS code_set FROM
(SELECT co.person_id
FROM 
  {database}.condition_occurrence co, 
  {database}.PEANUTALLERGY_COHORT_VARS cohort 
WHERE 
  co.condition_concept_id = CAST(cohort.standard_concept_id AS int64) 
  AND cohort.phenotype_definition_number = 1 
  AND cohort.standard_code_set = {code_set_group}
GROUP BY 
  co.person_id, cohort.standard_code_set
HAVING 
  COUNT(DISTINCT co.condition_concept_id) >= 1
UNION DISTINCT
SELECT de.person_id
FROM 
  {database}.drug_exposure de, 
  {database}.PEANUTALLERGY_COHORT_VARS cohort 
WHERE 
  de.drug_concept_id = CAST(cohort.standard_concept_id AS int64) 
  AND cohort.phenotype_definition_number = 1 
  AND cohort.standard_code_set = {code_set_group}
GROUP BY 
  de.person_id, cohort.standard_code_set
HAVING 
  COUNT(DISTINCT de.drug_concept_id) >= 1
UNION DISTINCT
SELECT m.person_id
FROM 
  {database}.measurement m, 
  {database}.PEANUTALLERGY_COHORT_VARS cohort 
WHERE 
  m.measurement_concept_id = CAST(cohort.standard_concept_id AS int64) 
  AND cohort.phenotype_definition_number = 1 
  AND cohort.standard_code_set = {code_set_group}
GROUP BY 
  m.person_id, cohort.standard_code_set
HAVING 
  COUNT(DISTINCT m.measurement_concept_id) >= 1
UNION DISTINCT
SELECT o.person_id
FROM 
  {database}.observation o, 
  {database}.PEANUTALLERGY_COHORT_VARS cohort 
WHERE 
  o.observation_concept_id = CAST(cohort.standard_concept_id AS int64) 
  AND cohort.phenotype_definition_number = 1 
  AND cohort.standard_code_set = {code_set_group}
GROUP BY 
  o.person_id, cohort.standard_code_set
HAVING 
  COUNT(DISTINCT o.observation_concept_id) >= 1
UNION DISTINCT
SELECT pr.person_id
FROM 
  {database}.procedure_occurrence pr, 
  {database}.PEANUTALLERGY_COHORT_VARS cohort 
WHERE 
  pr.procedure_concept_id = CAST(cohort.standard_concept_id AS int64) 
  AND cohort.phenotype_definition_number = 1 
  AND cohort.standard_code_set = {code_set_group}
GROUP BY 
  pr.person_id, cohort.standard_code_set
HAVING 
  COUNT(DISTINCT pr.procedure_concept_id) >= 1
)),

mx_case_inclusion_criteria_1 AS (
SELECT m.person_id, cohort.standard_code_set AS code_set
FROM 
  {database}.measurement m,
  {database}.PEANUTALLERGY_COHORT_VARS cohort
WHERE 
  m.measurement_concept_id = CAST(cohort.standard_concept_id AS int64)
  AND cohort.phenotype_definition_number = 2
  AND cohort.standard_code_set = {code_set_group}
GROUP BY m.person_id, cohort.standard_code_set
HAVING 
  COUNT(DISTINCT m.measurement_concept_id) >= 1
),

px_case_inclusion_criteria_1 AS (
SELECT pr.person_id, cohort.standard_code_set AS code_set
FROM 
  {database}.procedure_occurrence pr,
  {database}.PEANUTALLERGY_COHORT_VARS cohort
WHERE 
  pr.procedure_concept_id = CAST(cohort.standard_concept_id AS int64)
  AND cohort.phenotype_definition_number = 3
  AND cohort.standard_code_set = {code_set_group}
GROUP BY pr.person_id, cohort.standard_code_set
HAVING 
  COUNT(DISTINCT pr.procedure_concept_id) >= 1
),

px_case_inclusion_criteria_2 AS (
SELECT pr.person_id, cohort.standard_code_set AS code_set
FROM 
  {database}.procedure_occurrence pr,
  {database}.PEANUTALLERGY_COHORT_VARS cohort
WHERE 
  pr.procedure_concept_id = CAST(cohort.standard_concept_id AS int64)
  AND cohort.phenotype_definition_number = 4
  AND cohort.standard_code_set = {code_set_group}
GROUP BY pr.person_id, cohort.standard_code_set
HAVING 
  COUNT(DISTINCT pr.procedure_concept_id) >= 1
),

px_case_inclusion_criteria_3 AS (
SELECT pr.person_id, cohort.standard_code_set AS code_set
FROM 
  {database}.procedure_occurrence pr,
  {database}.PEANUTALLERGY_COHORT_VARS cohort
WHERE 
  pr.procedure_concept_id = CAST(cohort.standard_concept_id AS int64)
  AND cohort.phenotype_definition_number = 5
  AND cohort.standard_code_set = {code_set_group}
GROUP BY pr.person_id, cohort.standard_code_set
HAVING 
  COUNT(DISTINCT pr.procedure_concept_id) >= 1
),

px_case_inclusion_criteria_4 AS (
SELECT pr.person_id, cohort.standard_code_set AS code_set
FROM 
  {database}.procedure_occurrence pr,
  {database}.PEANUTALLERGY_COHORT_VARS cohort
WHERE 
  pr.procedure_concept_id = CAST(cohort.standard_concept_id AS int64)
  AND cohort.phenotype_definition_number = 6
  AND cohort.standard_code_set = {code_set_group}
GROUP BY pr.person_id, cohort.standard_code_set
HAVING 
  COUNT(DISTINCT pr.procedure_concept_id) >= 1
)

SELECT * FROM
  (SELECT person_id, code_set, 'PEANUTALLERGY_CASE_TYPE1' AS cohort_type
    FROM (
    SELECT person_id, code_set FROM all_case_inclusion_criteria_1
      INTERSECT DISTINCT 
     (SELECT person_id, code_set FROM mx_case_inclusion_criteria_1
      UNION DISTINCT
      SELECT person_id, code_set FROM px_case_inclusion_criteria_1)
   GROUP BY person_id, code_set, cohort_type))

  UNION ALL

  (SELECT person_id, code_set, 'PEANUTALLERGY_CASE_TYPE2' AS cohort_type
    FROM (
     SELECT person_id, code_set FROM all_case_inclusion_criteria_1
      INTERSECT DISTINCT 
    (SELECT person_id, code_set FROM px_case_inclusion_criteria_2
      UNION DISTINCT
     SELECT person_id, code_set FROM px_case_inclusion_criteria_3
      UNION DISTINCT
     SELECT person_id, code_set FROM px_case_inclusion_criteria_4))
   GROUP BY person_id, code_set, cohort_type)
;
mgkahn commented 5 years ago

CPT codes: Just as an aside -- data from the hospitals will have spotty coverage of CPT codes, which are much more complete in CU Medicine. So the presence of the allergy testing CPT codes in UCHealth data may be very small if any. Not because it wasn't done but because it is captured in professional billing, not hospital billing

mgkahn commented 5 years ago

No additional comments on peanut allergy other than above.

callahantiff commented 5 years ago

CPT codes: Just as an aside -- data from the hospitals will have spotty coverage of CPT codes, which are much more complete in CU Medicine. So the presence of the allergy testing CPT codes in UCHealth data may be very small if any. Not because it wasn't done but because it is captured in professional billing, not hospital billing

Interesting! Would this be the same case for CHCO and MIMIC? We are not currently using UCHealth data for this project (although I'd love to)!

mgkahn commented 5 years ago

Same for CHCO although in the past CHCO did do some physician CPT coding. But I think that's been taken over completely by CU Medicine now. I don't know about MIMIC. You would need to rummage around in their data dictionary.

callahantiff commented 5 years ago

Great, it sounds like we can close this issues and wrap this up. Please re-open if you disagree or think that there is more to be done.

mgkahn commented 5 years ago

Can this be re-opened? I do not think the inclusion criterial logic is correct unless there is some missing English. Does CASE TYPE 1 require that a patient meet all three inclusion criteria? CASE TYPE 2 clearly has an "OR" for criteria 2, 3, & 4 but it isn't clear from the English if all three criteria for CASE TYPE 1 must be met. If so, then need to change logic.

callahantiff commented 5 years ago

Can this be re-opened? I do not think the inclusion criterial logic is correct unless there is some missing English. Does CASE TYPE 1 require that a patient meet all three inclusion criteria? CASE TYPE 2 clearly has an "OR" for criteria 2, 3, & 4 but it isn't clear from the English if all three criteria for CASE TYPE 1 must be met. If so, then need to change logic.

Fine to re-open, but I feel confident that we have this correct. Here is the original criteria that was provided:

Screen Shot 2019-08-11 at 12 17 00

Do you not agree that this is what I have represented with the logic above?

mgkahn commented 5 years ago

I’d rewrite the criteria to use indented bullets for the OR to make it clear like it is for Definition 2. Else you have been using ANDs between inclusion criteria bullets in your other definitions.


From: Tiffany J. Callahan notifications@github.com Sent: Sunday, August 11, 2019 12:19:58 PM To: callahantiff/PheKnowVec PheKnowVec@noreply.github.com Cc: Kahn, Michael MICHAEL.KAHN@CUANSCHUTZ.EDU; State change state_change@noreply.github.com Subject: Re: [callahantiff/PheKnowVec] Query Verification: Peanut Allergy Cohort (#104)

Can this be re-opened? I do not think the inclusion criterial logic is correct unless there is some missing English. Does CASE TYPE 1 require that a patient meet all three inclusion criteria? CASE TYPE 2 clearly has an "OR" for criteria 2, 3, & 4 but it isn't clear from the English if all three criteria for CASE TYPE 1 must be met. If so, then need to change logic.

Fine to re-open, but I feel confident that we have this correct. Here is the original criteria that was provided:

[Screen Shot 2019-08-11 at 12 17 00]https://user-images.githubusercontent.com/8030363/62837901-43007180-bc32-11e9-97ae-d0009aed86b9.png

Do you not agree that this is what I have represented with the logic above?

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHubhttps://github.com/callahantiff/PheKnowVec/issues/104?email_source=notifications&email_token=AA557TSDW5MONSKMSFQ4DLTQEBJ45A5CNFSM4IF3NOCKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4BF7RY#issuecomment-520249287, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AA557TQ4B6X4NEEFAMJU2NLQEBJ45ANCNFSM4IF3NOCA.

callahantiff commented 5 years ago

I’d rewrite the criteria to use indented bullets for the OR to make it clear like it is for Definition 2. Else you have been using ANDs between inclusion criteria bullets in your other definitions.

I see, but that gets kind of confusing too (I had initially tried to apply something like this). Do you not think that the table is helpful?

Cohort Logic Table

COHORT CHUNK LOGICAL OPERATOR
CASE TYPE 1 all_case_inclusion_criteria_1 AND
CASE TYPE 1 mx_case_inclusion_criteria_1 OR
CASE TYPE 1 px_inclusion_criteria_1 ---
CASE TYPE 2 all_case_inclusion_criteria_1 AND
CASE TYPE 2 px_inclusion_criteria_2 OR
CASE TYPE 2 px_inclusion_criteria_3 OR
CASE TYPE 2 px_inclusion_criteria_4 ---

Case Criteria:
CASE TYPE 1

CASE TYPE 2


I could also add mathematical notation:

CASE_TYPE1 = all_case_inclusion_criteria_1 ∩ (mx_case_inclusion_criteria_1 ∪ px_inclusion_criteria_1)

CASE_TYPE2 = all_case_inclusion_criteria_1 ∩ (px_inclusion_criteria_2 ∪ px_inclusion_criteria_3 ∪ px_inclusion_criteria_4)

callahantiff commented 5 years ago

TODO FOR ME: