Closed ablack3 closed 5 months ago
Using PatientProfiles (PP) v0.5.1.
Correction: there are 'overall' records for drug exposure for the windows (-Inf, -1) and (-365, -31), but not for the window (-31, -1).
One question: would we expect the overall counts to be the sum of the strata counts? For example, in a specific time window for a specific drug, if we have 10 records for males and 5 records for females, would we expect 15 records for overall?
Hi @mderidder95 @ablack3, I think your unexpected results are due to the minimumFrequency argument: https://darwin-eu-dev.github.io/PatientProfiles/reference/summariseLargeScaleCharacteristics.html
Only counts with a percentage higher than 0.5 are reported. This is set to reduce the large amount of data produced by this function, but it can be turned off by setting minimumFrequency to 0.
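To illustrate how a frequency threshold can drop an overall row while a stratum row survives, here is a minimal Python sketch. It is not the PatientProfiles implementation: the function name `report_rows`, the counts, and the reading of the threshold as 0.5% applied with a strict "greater than" are all assumptions made for illustration.

```python
# Hypothetical counts for one drug in one window:
# (persons with the record, persons in the group).
groups = {
    "overall": (40, 10_000),
    "Male": (30, 4_000),
    "Female": (10, 6_000),
}


def report_rows(groups, minimum_frequency=0.005):
    """Keep only groups whose frequency exceeds the threshold (assumed rule)."""
    return {
        name: count
        for name, (count, denominator) in groups.items()
        if count / denominator > minimum_frequency
    }


# With the threshold, only the Male stratum (0.75%) is reported;
# overall (0.4%) and Female (about 0.17%) fall below it.
reported = report_rows(groups)

# With the threshold set to 0, every group with a non-zero count is reported.
unfiltered = report_rows(groups, minimum_frequency=0)
```

Under these assumed numbers a drug can appear in a sex stratum and still be absent from overall, because each group is compared against its own denominator.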
As for the non-matching counts: I know that some individuals in iqvia have missing sex, and the None stratum will not be displayed if it is suppressed due to minimumCellCount or minimumFrequency. Note that when None is reported, e.g. for concept = 41042861, the numbers do match.
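A small Python sketch of why the visible strata may not add up to overall when one stratum is hidden. The counts, the function name `suppress_small_cells`, and the suppression rule (hide any count below the minimum) are assumptions for illustration, not the package's actual logic.

```python
# Hypothetical counts for one concept in one window, split by sex.
strata_counts = {"Male": 10, "Female": 5, "None": 2}

# Overall covers everyone, including individuals with missing sex.
overall_count = sum(strata_counts.values())


def suppress_small_cells(counts, minimum_cell_count=5):
    """Hide counts below the minimum cell count (assumed suppression rule)."""
    return {
        name: count
        for name, count in counts.items()
        if count >= minimum_cell_count
    }


# The None stratum is hidden, so the visible strata no longer sum to overall.
visible = suppress_small_cells(strata_counts)
visible_sum = sum(visible.values())
```

Here the visible Male and Female counts sum to 15 while overall is 17; the difference is the suppressed None stratum.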
Thanks for the explanation Marti. We could try rerunning just this step of the study and set minimumFrequency = 0 and see if the results change as expected on iqvia.
Ok I think this is solved. Thank you @catalamarti !!
During the MDD study we ran large scale characteristics. In the output we get results for drug exposures within the various age and sex strata. However, we do not get any results for the "overall" stratum. Perhaps we are misunderstanding how this function should work. We expect that any records captured by a specific age or sex stratum would also be captured by the "overall" stratum.
In all databases except iqvia we have drug results in the overall stratum. In iqvia we have condition results in the overall stratum but no drug results.
Here is the code we ran.
Here are the iqvia results. patient_characterisation.csv
When you filter for the "overall" stratum you will find no drug exposures.
This may require investigation, and it could be an issue with the database rather than the software. So at this point let's first establish that the result is indeed unexpected.
@catalamarti would you expect the overall stratum to include any records captured by the age and sex strata (i.e. does "overall" include all strata)?
Also tagging @mderidder95 who identified this issue.