Error in Cohort Characterization tab

ohdsi-studies / PioneerWatchfulWaiting

This study is part of the joint PIONEER - EHDEN - OHDSI studyathon in March 2021, and aims to advance understanding of clinical management and outcomes of watchful waiting in prostate cancer.

Apache License 2.0

Error in Cohort Characterization tab #94

Open bdemeulder opened 3 years ago

bdemeulder commented 3 years ago

In the shiny app cohort characterisation, there seems to be an error in parsing the results.

Further, when looking only at the demographics domain, the rows seem to be replicated many times in the table (e.g. more than 170 occurrences of gender=male).

Comparing cohort characteristics seems fine.

MaximMoinat commented 3 years ago

That indeed looks like a join/merge gone wrong, but I can't reproduce this with the data on my side. Which premerge.Rdata file are you using? @bdemeulder

bdemeulder commented 3 years ago

@MaximMoinat I'm using the one I've just uploaded to teams (task #5/Processed results files/results)

MaximMoinat commented 3 years ago

I found the issue: the function getCovariateDataSubset returns data for all cohorts, while it should filter for the selected cohort. So the duplicates we are seeing are the covariates for all cohorts. We do not see this in the cohort comparison tab because it filters in a different way.

https://github.com/ohdsi-studies/PioneerWatchfulWaiting/blob/21c54fedeeff776322020a83525fc3a9d2d2e7d1/inst/shiny/PioneerWatchfulWaitingExplorer/server.R#L60-L66

Debugging further to find a solution.
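
The filtering problem described above can be illustrated with a minimal sketch. It is written in Python rather than the app's R, and every name in it (`rows`, `get_covariate_subset`, the cohort and covariate IDs) is hypothetical. It is not the actual `getCovariateDataSubset` code, only an illustration of how a missing cohort filter yields one repeated row per cohort.

```python
# Hypothetical characterization rows: one row per (cohort, covariate).
rows = [
    {"cohort_id": 1, "covariate_id": 8507001, "name": "gender = MALE", "mean": 1.0},
    {"cohort_id": 2, "covariate_id": 8507001, "name": "gender = MALE", "mean": 1.0},
    {"cohort_id": 3, "covariate_id": 8507001, "name": "gender = MALE", "mean": 1.0},
]


def get_covariate_subset(rows, cohort_id=None):
    """Return rows for one cohort; with cohort_id=None nothing is filtered."""
    if cohort_id is None:
        # Mimics the reported bug: rows for every cohort are returned.
        return list(rows)
    return [r for r in rows if r["cohort_id"] == cohort_id]


unfiltered = get_covariate_subset(rows)               # same covariate once per cohort
filtered = get_covariate_subset(rows, cohort_id=2)    # only the selected cohort
print(len(unfiltered), len(filtered))
```

With many cohorts in the merged results, the unfiltered call would repeat each demographic row once per cohort, which would be consistent with the many "gender=male" rows reported earlier in this thread.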

MaximMoinat commented 3 years ago

PR #95 fixes an issue on the shiny side. However, there is another issue with duplicate covariateIds. See the following screenshot, where e.g. id 462874664123 appears twice, each time with a different name. They should have different windowIds and therefore a different last digit.

It seems like the databases have conflicting sets of covariateIds. Could it be that one or more of them ran a different version of the study package?
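
One way to look for this kind of conflict is to collect, per covariate ID, every name it is given and which database supplied it. The sketch below is in Python for illustration only; the record layout, the database labels (`db_a`, `db_b`, `db_c`), the IDs and the names are all made up, and `find_conflicting_ids` is a hypothetical helper, not part of the study package.

```python
from collections import defaultdict

# Hypothetical (database, covariate_id, covariate_name) records.
records = [
    ("db_a", 1001, "condition X, window 1"),
    ("db_b", 1001, "condition X, window 1"),
    ("db_c", 1001, "condition Y, window 2"),
    ("db_a", 1002, "drug Z, window 1"),
    ("db_c", 1002, "drug Z, window 1"),
]


def find_conflicting_ids(records):
    """Return {covariate_id: {name: sorted databases}} for IDs with more than one name."""
    names_by_id = defaultdict(lambda: defaultdict(set))
    for database, covariate_id, name in records:
        names_by_id[covariate_id][name].add(database)
    return {
        cid: {name: sorted(dbs) for name, dbs in names.items()}
        for cid, names in names_by_id.items()
        if len(names) > 1
    }


conflicts = find_conflicting_ids(records)
for cid, names in conflicts.items():
    print(cid, names)
```

Any ID that shows up in the result carries at least two different names, and the database lists indicate which sources disagree.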

bdemeulder commented 3 years ago

I checked the different datasets, and there are indeed two sets of results with different covariate IDs. When generating a new premerge.Rdata with only CPRD, MarketScan and OPTUM (the three from Bayer) plus MAITT and TMC, the issue no longer occurs.

Those seem to agree on the covariate table.

So indeed, some datasets might have run a different version of the package.
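
A rough sketch of how databases could be grouped by whether they agree on the covariate table is given below. It is Python for illustration only, with made-up database labels, IDs and names, and `group_by_covariate_table` is a hypothetical helper. It assumes each database's covariate table can be read as a mapping from covariate ID to covariate name.

```python
from collections import defaultdict

# Hypothetical covariate tables: database -> {covariate_id: covariate_name}.
covariate_tables = {
    "db_a": {1001: "condition X, window 1", 1002: "drug Z, window 1"},
    "db_b": {1001: "condition X, window 1", 1002: "drug Z, window 1"},
    "db_c": {1001: "condition Y, window 2", 1002: "drug Z, window 1"},
}


def group_by_covariate_table(tables):
    """Group database names whose (id, name) pairs are exactly identical."""
    groups = defaultdict(list)
    for database, table in tables.items():
        # A frozenset of (id, name) pairs is an order-independent key for the table.
        groups[frozenset(table.items())].append(database)
    return sorted(sorted(dbs) for dbs in groups.values())


groups = group_by_covariate_table(covariate_tables)
print(groups)
```

Exact equality is a strict criterion: a database can legitimately lack some covariates, so in practice it may be more useful to compare only the IDs that two databases have in common.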

bdemeulder commented 3 years ago

@MaximMoinat, @keesvanbochove: how can we check what the proper covariate table should be in the current version, so we can pinpoint which datasets have run the correct version and which should run it again?

ablack3 commented 3 years ago

The version of the results viewer on https://data.ohdsi.org/PioneerWatchfulWaiting/ still has this issue. The download button on the characterization tab fails as well.

keesvanbochove commented 3 years ago

@bdemeulder, could you upload the new merged set to the OHDSI server so this can be fixed online as well?

bdemeulder commented 3 years ago

@keesvanbochove, I've updated the merged set and created pull request #134 on shiny deploy to trigger the update. Could you approve it?

ahijazy commented 1 year ago

I'm facing the same issue. Could you please clarify how the error was finally solved?