OHDSI / Aphrodite

[in development]
Apache License 2.0
37 stars 15 forks source link

I found an issue in the getdPatientCohort function. The current versi… #19

Open udaraabeysekara opened 3 years ago

udaraabeysekara commented 3 years ago

I found two issues in the code, and the following is the explanation of the problem.

Issue 1: Following is a section of the code used in the current version.

WHERE observation_concept_id IN (", paste(includeConceptlist,collapse=","), ") AND observation_concept_id NOT IN (", paste(excludeConceptlist,collapse=",")

The expected outcome is to find patient_ids who have at least one observation_concept_id in the includeConceptlist but have not in the excludeConceptlist. However, this part does not give the expected outcome. The current code corrected this issue using a join.

Issue 2: A given patient might have an observation_concept_id in the include concept list. However, the same person might have an exclusion concept in the condition_concept_id. Therefore, one has to combine both observation and condition_occurrence tables. The solution is to combine the observation and condition_occurrence tables using union all. Also, note that if someone is including measurement_concept_id, drug_concept_id, procedure_concept_id, and note_nlp_concept_ids in the inclusion/exclusion lists, we have to combine all tables using union all before checking for inclusion and exclusion criteria.

Solution. We need to fix three places. Place 1: The section from line 193 to 196. This section has both issues 1 and 2. One has to combine the observation and condition_occurrence tables. However, this requires changing line 233. I am afraid that could cause confusions. Therefore, in the current version, I only fix issue 1. If you agree with my logic. I will fix the issue 2 too.

Place 2: The section from line 198 to 230. I am not touching this section now. If you agree with my solution we could revisit this section.

Place 3: Line 240. This line also suffers from both issues 1 and 2. I fixed both problems in the new version. Note that I use a CTE to increase the clarity of the coding. This helps me to keep the code short and easy to read. However, CTEs are sometimes known to slow down the query. If CTEs are slowing you too much, we could modify this part easily. Also, note that I union six tables. I have two dashes in front of every union. During testing, one can comment on sections as needed.

Signed-off-by: udaraabeysekara udaraabeysekara@yahoo.com