ohdsi-studies / Covid19SubjectsAesiIncidenceRate

Extending on our previous work by Li et al. in understanding the incidence rates of adverse events of special interest (AESI) for COVID-19, this work will look at the rates of these AESIs in patients who had COVDI-19 disease.
https://ohdsi-studies.github.io/Covid19SubjectsAesiIncidenceRate/Protocol.html
2 stars 4 forks source link

SqlRender Compliance: Could not cast literal "9/9/1999" to type DATE #10

Closed ericaVoss closed 2 years ago

ericaVoss commented 2 years ago

Found some non-SqlRender Compliant code

DBMS:
bigquery

Error:
java.sql.SQLException: [Simba][BigQueryJDBCDriver](100032) Error executing query job. Message: Could not cast literal "9/9/1999" to type DATE at [12:16]

SQL:
--HINT DISTRIBUTE_ON_KEY(subject_id)
CREATE TABLE ca6nb3cgttar
 AS
SELECT
tc1.cohort_definition_id,
      tar1.time_at_risk_id,
        subject_id,
        case when tar1.time_at_risk_start_index = 0 and DATE_ADD(IF(SAFE_CAST(tc1.cohort_start_date  AS DATE) IS NULL,PARSE_DATE('%Y%m%d', cast(tc1.cohort_start_date  AS STRING)),SAFE_CAST(tc1.cohort_start_date  AS DATE)), interval tar1.time_at_risk_start_offset DAY) < op1.observation_period_end_date then DATE_ADD(IF(SAFE_CAST(tc1.cohort_start_date  AS DATE) IS NULL,PARSE_DATE('%Y%m%d', cast(tc1.cohort_start_date  AS STRING)),SAFE_CAST(tc1.cohort_start_date  AS DATE)), interval tar1.time_at_risk_start_offset DAY)
           when tar1.time_at_risk_start_index = 0 and DATE_ADD(IF(SAFE_CAST(tc1.cohort_start_date  AS DATE) IS NULL,PARSE_DATE('%Y%m%d', cast(tc1.cohort_start_date  AS STRING)),SAFE_CAST(tc1.cohort_start_date  AS DATE)), interval tar1.time_at_risk_start_offset DAY) >= op1.observation_period_end_date then op1.observation_period_end_date
          when tar1.time_at_risk_start_index = 1 and DATE_ADD(IF(SAFE_CAST(tc1.cohort_end_date  AS DATE) IS NULL,PARSE_DATE('%Y%m%d', cast(tc1.cohort_end_date  AS STRING)),SAFE_CAST(tc1.cohort_end_date  AS DATE)), interval tar1.time_at_risk_start_offset DAY) < op1.observation_period_end_date then DATE_ADD(IF(SAFE_CAST(tc1.cohort_end_date  AS DATE) IS NULL,PARSE_DATE('%Y%m%d', cast(tc1.cohort_end_date  AS STRING)),SAFE_CAST(tc1.cohort_end_date  AS DATE)), interval tar1.time_at_risk_start_offset DAY)
          when tar1.time_at_risk_start_index = 1 and DATE_ADD(IF(SAFE_CAST(tc1.cohort_end_date  AS DATE) IS NULL,PARSE_DATE('%Y%m%d', cast(tc1.cohort_end_date  AS STRING)),SAFE_CAST(tc1.cohort_end_date  AS DATE)), interval tar1.time_at_risk_start_offset DAY) >= op1.observation_period_end_date then op1.observation_period_end_date
          else '9/9/1999' --tc1.cohort_start_date --shouldnt get here if tar set properly
          end as start_date,
         case when tar1.time_at_risk_end_index = 0 and DATE_ADD(IF(SAFE_CAST(tc1.cohort_start_date  AS DATE) IS NULL,PARSE_DATE('%Y%m%d', cast(tc1.cohort_start_date  AS STRING)),SAFE_CAST(tc1.cohort_start_date  AS DATE)), interval tar1.time_at_risk_end_offset DAY) < op1.observation_period_end_date then DATE_ADD(IF(SAFE_CAST(tc1.cohort_start_date  AS DATE) IS NULL,PARSE_DATE('%Y%m%d', cast(tc1.cohort_start_date  AS STRING)),SAFE_CAST(tc1.cohort_start_date  AS DATE)), interval tar1.time_at_risk_end_offset DAY)
           when tar1.time_at_risk_end_index = 0 and DATE_ADD(IF(SAFE_CAST(tc1.cohort_start_date  AS DATE) IS NULL,PARSE_DATE('%Y%m%d', cast(tc1.cohort_start_date  AS STRING)),SAFE_CAST(tc1.cohort_start_date  AS DATE)), interval tar1.time_at_risk_end_offset DAY) >= op1.observation_period_end_date then op1.observation_period_end_date
          when tar1.time_at_risk_end_index = 1 and DATE_ADD(IF(SAFE_CAST(tc1.cohort_end_date  AS DATE) IS NULL,PARSE_DATE('%Y%m%d', cast(tc1.cohort_end_date  AS STRING)),SAFE_CAST(tc1.cohort_end_date  AS DATE)), interval tar1.time_at_risk_end_offset DAY) < op1.observation_period_end_date then DATE_ADD(IF(SAFE_CAST(tc1.cohort_end_date  AS DATE) IS NULL,PARSE_DATE('%Y%m%d', cast(tc1.cohort_end_date  AS STRING)),SAFE_CAST(tc1.cohort_end_date  AS DATE)), interval tar1.time_at_risk_end_offset DAY)
          when tar1.time_at_risk_end_index = 1 and DATE_ADD(IF(SAFE_CAST(tc1.cohort_end_date  AS DATE) IS NULL,PARSE_DATE('%Y%m%d', cast(tc1.cohort_end_date  AS STRING)),SAFE_CAST(tc1.cohort_end_date  AS DATE)), interval tar1.time_at_risk_end_offset DAY) >= op1.observation_period_end_date then op1.observation_period_end_date
          else '9/9/1999' --tc1.cohort_end_date --shouldnt get here if tar set properly
          end as end_date

FROM
(select * from ohdsi_covid19_ae_si.aesi_time_at_risk where time_at_risk_id in (1,2,3,4,5,6,7)) tar1,
 (select * from ohdsi_covid19_ae_si.aesi_target where cohort_definition_id in (562,563,565,566)) tc1
inner join observation_period op1
  on tc1.subject_id = op1.person_id
  and tc1.cohort_start_date >= op1.observation_period_start_date
  and tc1.cohort_start_date <= op1.observation_period_end_date

This has been addressed in prior versions of this package: https://github.com/ohdsi-studies/Covid19VaccineAesiIncidenceCharacterization/issues/11 solution involved using DATEFROMPARTS

ericaVoss commented 2 years ago

@schuemie - I was thinking about it, is there a way I would have known this wouldn’t work? I think ‘9/9/1999’ is MS SQL Server friendly but then SqlRender isn’t catching it to cast, and not sure how something like that could even be caught. In this case is the developer just responsible for knowing this?

schuemie commented 2 years ago

Per the SqlRender documentation, you should avoid implicit casts.

Also, in general '9/9/1999' is a somewhat ambiguous format (although in this case it doesn't matter you Americans like to use the wrong order of days and months. ;-) ) Better to always use 'YYYY-MM-DD'. that last bit isn't documented anywhere. Using DATEFROMPARTS() would be completely unambiguous.