ExposuresProvider / icees-api-config


Packed issue re discrepancy between Cohort Discovery and 2x2 associations, year issue #112

Closed karafecho closed 6 months ago

karafecho commented 6 months ago

I am preparing a manuscript (MS) to highlight ICEES, CAM KP, and ROBOKOP. As part of that effort, I planned to replicate the findings in Xu et al. 2020 and then check whether, for example, the racial disparities were greater in one sex than in the other. In doing so, I uncovered what appears to be one or more complicated bugs. Specifically, I can create cohorts of AfricanAmerican, AfricanAmerican/Male, and AfricanAmerican/Female, and the sample sizes appear to be correct. However, when I run a 2x2 with each of the cohorts, I receive the exact same output. Moreover, something is off with year: when I don't specify year in the 2x2, the sample size is smaller than when I do specify year.

Here are the Xu et al. 2020 findings: image

Here's the published query that generated those findings. image

And here's some of my testing results: ICEES_Visit-PM2.5_AfricanAmerican-Male-Female_query-results FOR HONG.txt

karafecho commented 6 months ago

Updated with results from Caucasians ... weird. ICEES_Visit-PM2.5_AfricanAmerican-Male-Female_query-results FOR HONG.txt

karafecho commented 6 months ago

Note to self: the Xu et al. 2020 results were generated using qcut, not cut. I tested using cut. That does not resolve the issue(s) here, but rather is intended to serve as a reminder for me.
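For readers unfamiliar with the distinction, here is a minimal sketch, assuming that "cut" and "qcut" refer to equal-width and quantile-based binning respectively (as in the pandas functions of those names). It is written in plain Python rather than with pandas, the helper names are hypothetical, and the PM2.5 values are made up for illustration, not ICEES data.

```python
import math
from statistics import quantiles

def equal_width_bins(values, n_bins):
    """cut-style: split the value range into n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins  # assumes the values are not all identical
    # Right-closed intervals; the minimum value is clamped into bin 0.
    return [max(math.ceil((v - lo) / width) - 1, 0) for v in values]

def quantile_bins(values, n_bins):
    """qcut-style: choose edges so each bin holds roughly the same number of values."""
    edges = quantiles(values, n=n_bins, method="inclusive")  # n_bins - 1 interior edges
    # A value equal to an edge stays in the lower bin (right-closed intervals).
    return [sum(v > e for e in edges) for v in values]

pm25 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 100.0]  # made-up, skewed values

cut_bins = equal_width_bins(pm25, 2)   # one outlier stretches the range
qcut_bins = quantile_bins(pm25, 2)     # edges follow the data's distribution

print(cut_bins)   # [0, 0, 0, 0, 0, 0, 0, 1]
print(qcut_bins)  # [0, 0, 0, 0, 1, 1, 1, 1]
```

With skewed data the two schemes put the same records into different bins, so a 2x2 built on cut bins is not expected to match one built on qcut bins.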

karafecho commented 6 months ago

ICEES_Visit-PM2.5_AfricanAmerican-Male-Female_query-results-DEV.txt

karafecho commented 6 months ago

From Hong: The cause is that countunique() was reading results from cache and therefore always returned the result of the first execution. This explains why I could not reproduce the problem initially: the previous result was cached only on prod, not on dev or in my local environment. The fix is to remove the cache decorator so that the previous result is no longer cached and the query is always executed again. I will deploy the fix to both the asthma dev and prod servers shortly so that you can test again and see whether you can reproduce the published results.
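The failure mode Hong describes can be sketched as follows. This is a hypothetical illustration, not the ICEES code: the thread does not show how countunique() was cached or which inputs the cache ignored. The sketch assumes a memoizing decorator (here Python's functools.lru_cache) whose key omits the cohort selection; all names and data below are invented.

```python
from functools import lru_cache

# Hypothetical stand-in for a cohort table; not the ICEES schema.
ROWS = [
    {"patient": 1, "sex": "Male"},
    {"patient": 2, "sex": "Female"},
    {"patient": 3, "sex": "Female"},
]

# State that the counting function reads but that is not one of its arguments.
current_filter = {"sex": "Male"}

def _count(column):
    """Count distinct values of `column` among rows matching current_filter."""
    matching = [r for r in ROWS if all(r[k] == v for k, v in current_filter.items())]
    return len({r[column] for r in matching})

@lru_cache(maxsize=None)
def count_unique_cached(column):
    # The cache key is only `column`, so changes to current_filter go unnoticed.
    return _count(column)

def count_unique_uncached(column):
    # No decorator: the count is recomputed on every call.
    return _count(column)

first = count_unique_cached("patient")    # computed for the Male filter
current_filter = {"sex": "Female"}        # switch to a different cohort
stale = count_unique_cached("patient")    # served from cache: still the Male count
fresh = count_unique_uncached("patient")  # recomputed for the Female filter

print(first, stale, fresh)  # 1 1 2
```

Removing the decorator, as was done here, trades speed for correctness. An alternative would be to include every input that affects the result in the cache key, but the thread does not say whether that was considered.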

Cache decorator removed from all ICEES+ instances. Redeployed. Tested.

Closing issue ...