tuva-health / tuva

Main repo including core data model, data marts, reference data, terminology, and the clinical concept library
https://thetuvaproject.com/
189 stars 49 forks source link

The `data_quality__eligibility_death_flag` check causes a comparison error when run on PostgreSQL #553

Open BuzzCutNorman opened 3 months ago

BuzzCutNorman commented 3 months ago

Describe the bug - Required The data_quality__eligibility_death_flag check causes a comparison error when run on PostgreSQL. PostgreSQL will only allow boolean to be compared with data types of boolean and str. Here is a link that explains in more detail.

https://www.postgresql.org/docs/current/datatype-boolean.html

I am also assuming that in the input layer eligibility definition that boolean mean the boolean type for the datawarehouse platform. Looking at the query for the death flag check I could have assumed incorrectly, and I should be converting the boolean columns to int of only 1 or 0.


Environment - Required


To Reproduce Steps to reproduce the behavior: I executed dbt run with claims_enabled: true set in the dbt_project.yml:


Expected behavior The data quality checks would run without encountering any PostgreSQL operator errors..


Screenshots If applicable, add screenshots to help explain your problem.


Additional context

13:45:20.781423 [debug] [Thread-4 (]: Began running node model.the_tuva_project.data_quality__eligibility_death_flag
13:45:20.781423 [info ] [Thread-4 (]: 55 of 638 START sql table model data_quality.eligibility_death_flag ............ [RUN]
13:45:20.781423 [debug] [Thread-4 (]: Re-using an available connection from the pool (formerly model.the_tuva_project.data_quality__eligibility_death_date, now model.the_tuva_project.data_quality__eligibility_death_flag)
13:45:20.781423 [debug] [Thread-4 (]: Began compiling node model.the_tuva_project.data_quality__eligibility_death_flag
13:45:20.781423 [debug] [Thread-4 (]: Writing injected SQL for node "model.the_tuva_project.data_quality__eligibility_death_flag"
13:45:20.781423 [debug] [Thread-4 (]: Began executing node model.the_tuva_project.data_quality__eligibility_death_flag
13:45:20.781423 [debug] [Thread-4 (]: Writing runtime sql for node "model.the_tuva_project.data_quality__eligibility_death_flag"
13:45:20.797048 [debug] [Thread-4 (]: Using postgres connection "model.the_tuva_project.data_quality__eligibility_death_flag"
13:45:20.797048 [debug] [Thread-4 (]: On model.the_tuva_project.data_quality__eligibility_death_flag: BEGIN
13:45:20.797048 [debug] [Thread-4 (]: Opening a new connection, currently in state closed
13:45:20.828294 [debug] [Thread-4 (]: SQL status: BEGIN in 0.037 seconds
13:45:20.828294 [debug] [Thread-4 (]: Using postgres connection "model.the_tuva_project.data_quality__eligibility_death_flag"
13:45:20.828294 [debug] [Thread-4 (]: On model.the_tuva_project.data_quality__eligibility_death_flag: /* {"app": "dbt", "dbt_version": "1.8.4", "profile_name": "meltano", "target_name": "prod", "node_id": "model.the_tuva_project.data_quality__eligibility_death_flag"} */

  create  table "datawarehouse"."data_quality"."eligibility_death_flag__dbt_tmp"

    as

  (

SELECT DISTINCT
    M.Data_SOURCE
    ,coalesce(cast(M.ENROLLMENT_START_DATE as TEXT),cast('1900-01-01' as TEXT)) AS SOURCE_DATE
    ,'ELIGIBILITY' AS TABLE_NAME
    ,'Member ID' AS DRILL_DOWN_KEY
    ,coalesce(M.Member_ID, 'NULL') as drill_down_value
    ,'ELIGIBILITY' AS CLAIM_TYPE
    ,'DEATH_FLAG' AS FIELD_NAME
    ,CASE
        WHEN M.DEATH_FLAG in (1,0) THEN 'valid'
        WHEN M.DEATH_FLAG is null then 'null'
        ELSE 'invalid'
        END AS BUCKET_NAME
    ,cast(null as TEXT) as INVALID_REASON
    ,CAST(Death_Flag as TEXT) AS FIELD_VALUE
    , '2024-07-25 19:29:16.725294+00:00' as tuva_last_run
FROM "datawarehouse"."input_layer"."eligibility" M
  );

13:45:20.859549 [debug] [Thread-4 (]: Postgres adapter: Postgres error: operator does not exist: boolean = integer
LINE 23:         WHEN M.DEATH_FLAG in (1,0) THEN 'valid'
                                   ^
HINT:  No operator matches the given name and argument types. You might need to add explicit type casts.

13:45:20.859549 [debug] [Thread-4 (]: On model.the_tuva_project.data_quality__eligibility_death_flag: ROLLBACK
13:45:20.859549 [debug] [Thread-4 (]: On model.the_tuva_project.data_quality__eligibility_death_flag: Close
13:45:20.953290 [debug] [Thread-4 (]: Database Error in model data_quality__eligibility_death_flag (models\data_quality\intermediate\atomic_checks\claims\eligibility\data_quality__eligibility_death_flag.sql)
  operator does not exist: boolean = integer
  LINE 23:         WHEN M.DEATH_FLAG in (1,0) THEN 'valid'
                                     ^
  HINT:  No operator matches the given name and argument types. You might need to add explicit type casts.
  compiled Code at C:\EL_Projects\tuva-project/.meltano/transformers/dbt/target\run\the_tuva_project\models\data_quality\intermediate\atomic_checks\claims\eligibility\data_quality__eligibility_death_flag.sql
13:45:20.953290 [error] [Thread-4 (]: 55 of 638 ERROR creating sql table model data_quality.eligibility_death_flag ... [ERROR in 0.17s]
13:45:20.953290 [debug] [Thread-4 (]: Finished running node model.the_tuva_project.data_quality__eligibility_death_flag
aneiderhiser commented 6 days ago

@BuzzCutNorman do you know if this is still an issue?

BuzzCutNorman commented 5 days ago

Just checked 0.12.2 without a work around and got the same error:

20:30:38  Finished running 1 table model, 1 project hook in 0 hours 0 minutes and 3.27 seconds (3.27s).
20:30:39
20:30:39  Completed with 1 error and 0 warnings:
20:30:39
20:30:39    Database Error in model data_quality__eligibility_death_flag (models\data_quality\dqi\intermediate\atomic_checks\claims\eligibility\data_quality__eligibility_death_flag.sql)
  operator does not exist: boolean = integer
  LINE 23:         when m.death_flag in (1,0) then 'valid'
                                     ^
  HINT:  No operator matches the given name and argument types. You might need to add explicit type casts.