ExposuresProvider / icees-api

PCD endpoint exposes EPR/PEGS variables #283

Closed karafecho closed 10 months ago

karafecho commented 11 months ago

This issue is to report that the PCD endpoint is incorrectly exposing EPR/PEGS variables. The variables are empty (every True and False count is zero, and all 335686 values are None), which is expected because the EPR/PEGS data are available only for the asthma and COVID cohorts. As such, the last join step should not be run for the PCD dataset.

Example:


curl -X 'GET' \
  'https://icees-pcd.renci.org/patient/cohort/COHORT%3A1/features' \
  -H 'accept: text/tabular'

EPR/PEGS variables:

+--------------------------------+---------+
| feature                        | count   |
+================================+=========+
| D28B_STILL_HAVE_ASTHMA = True  | 0       |
|                                | 0.00%   |
+--------------------------------+---------+
| D28B_STILL_HAVE_ASTHMA = False | 0       |
|                                | 0.00%   |
+--------------------------------+---------+
| D28B_STILL_HAVE_ASTHMA = None  | 335686  |
|                                | 100.00% |
+--------------------------------+---------+
+---------------------------------+---------+
| feature                         | count   |
+=================================+=========+
| D28C_ASTHMA_EPISODE_12M = True  | 0       |
|                                 | 0.00%   |
+---------------------------------+---------+
| D28C_ASTHMA_EPISODE_12M = False | 0       |
|                                 | 0.00%   |
+---------------------------------+---------+
| D28C_ASTHMA_EPISODE_12M = None  | 335686  |
|                                 | 100.00% |
+---------------------------------+---------+
+----------------------------------+---------+
| feature                          | count   |
+==================================+=========+
| D28D_ASTHMA_ER_VISIT_12M = True  | 0       |
|                                  | 0.00%   |
+----------------------------------+---------+
| D28D_ASTHMA_ER_VISIT_12M = False | 0       |
|                                  | 0.00%   |
+----------------------------------+---------+
| D28D_ASTHMA_ER_VISIT_12M = None  | 335686  |
|                                  | 100.00% |
+----------------------------------+---------+

hyi commented 11 months ago

@karafecho It looks like this issue is also caused by the PCD feature YAML file. For example, the feature variable D28B_STILL_HAVE_ASTHMA is defined in that file. If we remove these feature variables from the feature YAML file, the EPR/PEGS variables should no longer show up.

karafecho commented 11 months ago

Kara to update YAML file after Hong writes a script to create a diff file showing discrepancies in variables between the patient dataset and the all_features YAML file.
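
For illustration only, a minimal sketch of what such a diff script might look like. This is not the script that was actually written. It assumes the patient dataset is available as a CSV whose header row lists the variable names, and that the YAML file has already been parsed into a mapping keyed by feature name (for example with a YAML library). The function names, sample column names, and both of those layout assumptions are hypothetical.

```python
import csv
import io


def dataset_columns(csv_text):
    """Return the set of column names from the header row of a CSV export.

    Assumption: the patient dataset is a CSV with one header row.
    """
    reader = csv.reader(io.StringIO(csv_text))
    return set(next(reader, []))


def diff_variables(columns, yaml_features):
    """Report variables that appear on only one side.

    Assumption: yaml_features is a mapping whose keys are the feature
    names declared in the features YAML file.
    """
    declared = set(yaml_features)
    return {
        "only_in_yaml": sorted(declared - columns),
        "only_in_dataset": sorted(columns - declared),
    }


if __name__ == "__main__":
    # Hypothetical sample data; PatientId and Sex are made-up column names.
    sample_csv = "PatientId,Sex\n1,M\n"
    sample_features = {
        "PatientId": {},
        "Sex": {},
        "D28B_STILL_HAVE_ASTHMA": {},
    }
    report = diff_variables(dataset_columns(sample_csv), sample_features)
    for name in report["only_in_yaml"]:
        print("declared in YAML but missing from dataset:", name)
    for name in report["only_in_dataset"]:
        print("present in dataset but missing from YAML:", name)
```

Under these assumptions, variables listed as declared in the YAML but missing from the dataset would be the candidates for removal from the PCD feature YAML file.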

karafecho commented 10 months ago

Complete, closing issue.