oxford-pharmacoepi / MegaStudy

IncidencePrevalence generateDenominatorCohortSet: Error with Date format MM-DD-YYYY in Snowflake #34

Closed Katrin-MLK closed 5 months ago

Katrin-MLK commented 6 months ago

Hi everyone, has anybody been able to run the IncidencePrevalence package successfully on a Snowflake OMOP environment, and could you give us some advice please?

We are in the following situation:

Is there any way we could make the function accept the different date format, or pass the dates in as strings? How have others solved this? Thanks.

tiozab commented 6 months ago

@martapineda can you check if there has been a successful run in a snowflake environment please? thank you :-)

tiozab commented 6 months ago

@edward-burn @catalamarti Both functions, generateDrugUtilisationCohortSet() and generateDenominatorCohortSet(), use the cohortDateRange argument. The first accepts the dates, the second does not. How can that be?

@Katrin-MLK What does your cdm$drug_cohorts look like? All good?

Katrin-MLK commented 6 months ago

@tiozab Yes, I believe it looks good and as expected. [image]

martapineda commented 6 months ago

> @martapineda can you check if there has been a successful run in a snowflake environment please? thank you :-)

Of the DPs that we know use Snowflake, none has uploaded IncidencePrevalence results yet.

tiozab commented 6 months ago

@Katrin-MLK I think we need to take a closer look at your cdm$drug_cohorts.

Does it look like ours? Do the variables have the same types, and are the dates in yyyy-mm-dd format? [image]

Katrin-MLK commented 6 months ago

@tiozab Yes, all good up to this point. It looks exactly like yours. Any further suggestions on what to check? [image]

tiozab commented 6 months ago

@edward-burn I need your help here to know what happens exactly inside the generateDenominatorCohortSet() function.

tiozab commented 5 months ago

@Katrin-MLK, thanks for the discussion. We would need you to confirm that there are no overlapping observation periods in the observation_period table. Furthermore, @edward-burn will open an issue on CDMConnector to check whether the mm-dd-yyyy date format is allowed in OMOP date fields.

Katrin-MLK commented 5 months ago

Thank you @tiozab and @edward-burn. I have checked the observation_period table. The database in question is Merative MarketScan Claims, which consists of CCAE and MDCR. Our CDM observation_period table for this database has an additional column, INSURANCE_TYPE, with the flags 0 = CCAE, 1 = MDCR, and 2 = CCAE+MDCR (if the patient has both). Observation periods may overlap across flags, but they do not overlap within a single flag. So we may either have to exclude this database as a limitation, or find a way to pass this flag so the observation_period table can be filtered. None of our other databases has this additional field in the CDM, and their observation periods do not overlap. So once the MM-DD-YYYY date format is clarified, we could in principle proceed with all of our other databases.
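In case it helps others, here is a minimal sketch of how such an overlap check could look in R with dplyr. It runs on a small made-up table, not on our data. The helper name `find_overlaps`, the lower-case column name `insurance_type`, and the toy rows are assumptions for illustration only; on a real CDM the table would come from the database, and same-day start/end dates are treated as an overlap here.

```r
library(dplyr)

# Hypothetical toy data shaped like observation_period, plus the extra
# insurance_type column described above (0 = CCAE, 1 = MDCR, 2 = both).
observation_period <- tibble(
  person_id = c(1, 1, 2, 2),
  insurance_type = c(0, 1, 0, 0),
  observation_period_start_date = as.Date(
    c("2015-01-01", "2015-06-01", "2016-01-01", "2017-01-01")
  ),
  observation_period_end_date = as.Date(
    c("2015-12-31", "2016-05-31", "2016-12-31", "2017-12-31")
  )
)

# Hypothetical helper: within each group, return the periods that start on or
# before the latest end date of any earlier period in the same group.
find_overlaps <- function(obs, ...) {
  obs %>%
    group_by(...) %>%
    arrange(observation_period_start_date, .by_group = TRUE) %>%
    mutate(
      previous_end = lag(cummax(as.numeric(observation_period_end_date)))
    ) %>%
    filter(
      !is.na(previous_end),
      as.numeric(observation_period_start_date) <= previous_end
    ) %>%
    ungroup()
}

# Overlaps per person, ignoring the insurance flag
overlaps_across <- find_overlaps(observation_period, person_id)

# Overlaps per person within each insurance flag
overlaps_within <- find_overlaps(observation_period, person_id, insurance_type)

# Restricting to a single flag before the check, e.g. CCAE only
ccae_only <- observation_period %>% filter(insurance_type == 0)
overlaps_ccae <- find_overlaps(ccae_only, person_id)
```

In this toy example, person 1 has overlapping periods across the two flags but none within a single flag, which mirrors the situation described above.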

Katrin-MLK commented 5 months ago

@tiozab With the following workaround we were able to overcome the MM-DD-YYYY date format problem and run the IncidencePrevalence step successfully. We added a line that alters the default date format of the database session, so that it uses the input date format 'YYYY-MM-DD' required by the IncidencePrevalence package:

db <- dbConnect("...")

# Set the DATE_INPUT_FORMAT session parameter
DBI::dbExecute(db, "ALTER SESSION SET DATE_INPUT_FORMAT = 'YYYY-MM-DD'")

tiozab commented 5 months ago

@Katrin-MLK For which insurance type did you run it?