OHDSI / Circe

[Under development] CIRCE is a cohort definition and syntax compiler tool for OMOP CDMv5
Apache License 2.0
5 stars 4 forks source link

Possible issue with dates when DBMS is Oracle #50

Closed rkboyce closed 8 years ago

rkboyce commented 8 years ago

Hi,

I set up the multihome WebAPI for the first time today using CDM and OHDSI schemas in our Oracle DBMS. Doing various tests with Circe and ran into a possible problem. When generating a cohort that uses any kind of specific date restriction, the generated SQL triggers ORA-01861: literal does not match format string. See a specific example below. I found this did run after wrapping dates with to_date() e.g.,

C.drug_exposure_start_date <=  C.drug_exposure_start_date <= to_date('2016-01-01','YYYY-MM-DD')

QUERY THAT FAILED:

CREATE TABLE  b804zh31Codesets
  AS
SELECT
   codeset_id, concept_id 

FROM

(SELECT   0 as codeset_id, c.concept_id  FROM  (SELECT   distinct I.concept_id  FROM 
(SELECT    DISTINCT concept_id  FROM  DIKB_DEV.CONCEPT   WHERE  concept_id in (40231804) and invalid_reason is null
    UNION 

  SELECT  c.concept_id
   FROM  DIKB_DEV.CONCEPT c
  join DIKB_DEV.CONCEPT_ANCESTOR ca on c.concept_id = ca.descendant_concept_id
  and ca.ancestor_concept_id in (40231804)
  and c.invalid_reason is null

 ) I
 ) C
 ) C
 ; CREATE TABLE  b804zh31PrimaryCriteriaEvents
   AS
SELECT
     row_number() over (order by P.person_id, P.start_date) as event_id, P.person_id, P.start_date, P.end_date, OP.observation_period_start_date as op_start_date, OP.observation_period_end_date as op_end_date

FROM

(SELECT   P.person_id, P.start_date, P.end_date, ROW_NUMBER() OVER (PARTITION BY person_id ORDER BY start_date ASC) ordinal
   FROM  
  (SELECT     C.person_id, C.drug_exposure_start_date as start_date, COALESCE(C.drug_exposure_end_date, ( C.drug_exposure_start_date +  1)) as end_date, C.drug_concept_id as TARGET_CONCEPT_ID
  FROM   
(SELECT     de.*, ROW_NUMBER() over (PARTITION BY de.person_id ORDER BY de.drug_exposure_start_date) as ordinal
    FROM   DIKB_DEV.DRUG_EXPOSURE de
  WHERE  de.drug_concept_id in (SELECT     concept_id   FROM    b804zh31Codesets   WHERE  codeset_id = 0 )
 ) C

  WHERE  C.drug_exposure_start_date <= '2016-01-01'

   ) P
 ) P
JOIN DIKB_DEV.observation_period OP on P.person_id = OP.person_id and P.start_date between OP.observation_period_start_date and op.observation_period_end_date
  WHERE  (OP.OBSERVATION_PERIOD_START_DATE + 0) <= P.START_DATE AND (P.START_DATE + 0) <= OP.OBSERVATION_PERIOD_END_DATE
 ; DELETE FROM OHDSI.cohort where cohort_definition_id = 3; INSERT INTO OHDSI.cohort (cohort_definition_id, subject_id, cohort_start_date, cohort_end_date)
SELECT   3 as cohort_definition_id, person_id as subject_id, start_date as cohort_start_date, end_date as cohort_end_date
 FROM  
(SELECT   RawEvents.*, row_number() over (partition by RawEvents.person_id order by RawEvents.start_date ASC) as ordinal
   FROM 
  (SELECT   pe.person_id, pe.start_date, pe.end_date
     FROM  b804zh31PrimaryCriteriaEvents pe

   ) RawEvents
 ) Results
chrisknoll commented 8 years ago

Looks like some type of casting needs to be added that will make sqlRender translate it properly to Oracle.

@schuemie , What's the appropriate syntax for SqlRender that will translate to to_date() (or equivalent for Oracle?)

Thanks!

schuemie commented 8 years ago

Ughhh, yes, literal dates are a bit of a mess. First of all, I 'standardized' on this date format: 'YYYYMMDD', so '20160101' in this case.

Second, you want to use

CAST('20160101' AS DATE)

as the input to SqlRender for it to translate correctly (on Oracle) to

TO_DATE('20160101', 'yyyymmdd')

Perhaps a cleaner way would be to be explicit about the date format using DATE_FROM_PARTS:

DATE_FROM_PARTS(2016,1,1)

which will translate (on Oracle) to:

TO_DATE(TO_CHAR(2016,'0000')||'-'||TO_CHAR(1,'00')||'-'||TO_CHAR(1,'00'), 'YYYY-MM-DD')
chrisknoll commented 8 years ago

@schuemie , can we standardize on the date format yyyy-mm-dd. That's the mssql format that they adopted for using string literals, so can we just tell oracle to expect that format, and in our own 'ohdsi-sql' queries, any time we specify a standard date, we use yyyy-mm-dd format? I only suggest this beacuse if you run select CAST('20160101' AS DATE) you get 2016-01-01 as the result...so just suggesting using the native format.

But, for this problem I think your suggestion to use date_from_parts() is better in that it's more explicit, but the pain is I have to find all places where dates are used and change the sql query to use that function...but, it is the better way, so I'll make that change.

As far as the standard date format that sqlrender uses, I'll leave that entirely up to you.

-Chris

chrisknoll commented 8 years ago

Ahah! But I was clever and I put all date-to-sql translation into a common function. So this should be trivial.

@rkboyce , I will push a fix to master, and close this issue once pushed. You will need to pull latest version of WebAPI once this is done (this is a WebAPI fix, not a CIrce fix, but reporting this under CIRCE is totally fine).

-Chris