Closed karafecho closed 6 months ago
Additional Slack exchanges with Hong revealed that this issue is not an enum issue, but rather an operator issue. Moreover, there is inconsistency in the various deployments, with prod apparently lagging behind dev.
From Hong: I have actually implemented filtering out value sets defined in all feature yaml file by the cohort definition with equal operator, but it is more convoluted for consolidating > and < operators between feature yaml definition and cohort definition. For example, if cohort definition has TotalEDInpatientVisits<5
, those rows with TotalEDInpatientVisits>=5
will be filtered out and not show up in the table output. However, >9 is not excluded for <10 cohort definition since I did not consolidate the two operators.
We agreed to add logic such that >9 will be changed to >=10 and, e.g., <9 changed to to <=8. This will support table output values to be constrained by a cohort definition operator of the same variable and filter out >9 rows for a cohort definition TotalEDInpatientVisits<10
.
Hong deployed the fix to prod. Kara will test.
Update:
Race_UNC
to Race
when testing the query. So, user error.{
"cohort_id": "COHORT:11",
"size": 767,
"features": {
"TotalEDInpatientVisits": {
"operator": ">=",
"value": "9"
}
}
},
But the Features endpoint is returning visit data for all enumerations and for what appears to be the complete dataset.
curl -X 'GET' \
'https://icees-pcd.renci.org/patient/cohort/COHORT%3A11/features' \
-H 'accept: text/tabular'
+----------------------------+---------+
| feature | count |
+============================+=========+
| TotalEDInpatientVisits = 0 | 3676 |
| | 31.65% |
+----------------------------+---------+
| TotalEDInpatientVisits = 1 | 811 |
| | 6.98% |
+----------------------------+---------+
| TotalEDInpatientVisits = 2 | 724 |
| | 6.23% |
+----------------------------+---------+
| TotalEDInpatientVisits = 3 | 560 |
| | 4.82% |
+----------------------------+---------+
| TotalEDInpatientVisits = 4 | 517 |
| | 4.45% |
+----------------------------+---------+
| TotalEDInpatientVisits = 5 | 404 |
| | 3.48% |
+----------------------------+---------+
| TotalEDInpatientVisits = 6 | 387 |
| | 3.33% |
+----------------------------+---------+
| TotalEDInpatientVisits = 7 | 360 |
| | 3.10% |
+----------------------------+---------+
| TotalEDInpatientVisits = 8 | 307 |
| | 2.64% |
+----------------------------+---------+
| TotalEDInpatientVisits = 9 | 551 |
| | 4.74% |
+----------------------------+---------+
| TotalEDInpatientVisits > 9 | 3316 |
| | 28.55% |
+----------------------------+---------+
@karafecho I just tested it again and can verify multivariate endpoint now works correctly with COHORT:6 and COHORT:11. See the screen capture to see what my request query look like. . Can you test it again? We can touch base on slack if you still see issues with the multivariate endpoint after the fix.
I can reproduce the issue with features endpoint as you reported above, but that is a different issue from my fix to this multivariate endpoint since they don't share the same code path. I'll look into that issue later, but I think it'd work better if we could separate them into two different issues since the fix to them would be different. My fix here only applies to the multivariate endpoint since the new logic I added only involves the interplay between the value sets retrieved from all feature yaml file and the cohort definition.
I can run the multivariate query on PROD. I had tested the query you noted and several others earlier today, but the multivariate endpoint didn't seem to be working properly. Correction: I tested the wrong endpoint. So, user error.
However, I'm still seeing issues. COHORT:11 is defined as TotalEDInpatientVisits>=9
in all patients (i.e., not those in year=2020
). As such, the multivariate endpoint should be returning results for TotalEDInpatientVisits>9
only. But this is what I'm seeing when restricting the query to year=2020
.
curl -X 'GET' \
'https://icees-pcd.renci.org/patient/cohort/COHORT%3A11/features?year=2020' \
-H 'accept: text/tabular'
+----------------------------+---------+
| feature | count |
+============================+=========+
| TotalEDInpatientVisits = 0 | 125 |
| | 11.15% |
+----------------------------+---------+
| TotalEDInpatientVisits = 1 | 64 |
| | 5.71% |
+----------------------------+---------+
| TotalEDInpatientVisits = 2 | 64 |
| | 5.71% |
+----------------------------+---------+
| TotalEDInpatientVisits = 3 | 59 |
| | 5.26% |
+----------------------------+---------+
| TotalEDInpatientVisits = 4 | 33 |
| | 2.94% |
+----------------------------+---------+
| TotalEDInpatientVisits = 5 | 38 |
| | 3.39% |
+----------------------------+---------+
| TotalEDInpatientVisits = 6 | 43 |
| | 3.84% |
+----------------------------+---------+
| TotalEDInpatientVisits = 7 | 35 |
| | 3.12% |
+----------------------------+---------+
| TotalEDInpatientVisits = 8 | 49 |
| | 4.37% |
+----------------------------+---------+
| TotalEDInpatientVisits = 9 | 74 |
| | 6.60% |
+----------------------------+---------+
| TotalEDInpatientVisits > 9 | 537 |
| | 47.90% |
+----------------------------+---------+
Am I missing something?
WRT the Features functionality, I'll create a new issue.
@karafecho The curl query you pasted above is for features endpoint, not for multivariate endpoint. Just tested multivariate endpoint again with year=2020 and COHORT:11 input, and here are the results I got which appears to work as expected.
+--------------------------+-----------+------------------------------------+-------------+
| TotalEDInpatientVisits | Sex | Race | frequency |
+==========================+===========+====================================+=============+
| = 9 | = Male | = Native Hawaiian/Pacific Islander | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Male | = Caucasian | 9 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Male | = African American | 3 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Male | = Asian | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Male | = Unknown | 1 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Male | = American/Alaskan Native | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Female | = Native Hawaiian/Pacific Islander | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Female | = Caucasian | 20 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Female | = African American | 2 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Female | = Asian | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Female | = Unknown | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Female | = American/Alaskan Native | 1 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Unknown | = Native Hawaiian/Pacific Islander | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Unknown | = Caucasian | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Unknown | = African American | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Unknown | = Asian | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Unknown | = Unknown | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Unknown | = American/Alaskan Native | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Other | = Native Hawaiian/Pacific Islander | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Other | = Caucasian | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Other | = African American | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Other | = Asian | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Other | = Unknown | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| = 9 | = Other | = American/Alaskan Native | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Male | = Native Hawaiian/Pacific Islander | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Male | = Caucasian | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Male | = African American | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Male | = Asian | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Male | = Unknown | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Male | = American/Alaskan Native | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Female | = Native Hawaiian/Pacific Islander | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Female | = Caucasian | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Female | = African American | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Female | = Asian | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Female | = Unknown | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Female | = American/Alaskan Native | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Unknown | = Native Hawaiian/Pacific Islander | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Unknown | = Caucasian | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Unknown | = African American | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Unknown | = Asian | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Unknown | = Unknown | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Unknown | = American/Alaskan Native | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Other | = Native Hawaiian/Pacific Islander | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Other | = Caucasian | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Other | = African American | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Other | = Asian | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Other | = Unknown | 0 |
+--------------------------+-----------+------------------------------------+-------------+
| > 9 | = Other | = American/Alaskan Native | 0 |
+--------------------------+-----------+------------------------------------+-------------+
Oh, right, duh. Sorry, I've been in back-to-back meetings since 8:30 am this morning and am a bit brain dead.
I can reproduce the results you posted above, so let's consider this issue resolved.
Closing with comment.
This issue is to report a potential operator and/or enumeration type mishandling at the ICEES+ cohort creation and/or multivariate endpoints.
Specifically, the table below was generated at the ICEES+ PCD multivariate endpoint using a cohort that correctly restricted the patients to those in
year=2020
and TotalEDInpatientVisits<=9. You'll see that there are two blocks whereTotalEDInpatientVisits=9
. I believe that the second block is actuallyTotalEDInpatientVisits>9
, as the numbers agree with those that I see in the underlying data. Recall that our enumeration forTotalEDInpatientVisit
s is 0, 1, ...9, '>9'. While the COHORT was created properly (N=4569), the multivariate call is distinguishingTotalEDInpatientVisits=9
vs TotalEDInpatientVisits>9 but not excludingTotalEDInpatientVisits>9
(total frequency > 4569).table4.1B (1).pdf