StarRocks / dbt-starrocks

dbt-starrocks contains all of the code enabling dbt to work with StarRocks
16 stars 6 forks source link

StarRocks error: 1064 (HY000): Insert has filtered data in strict mode, txn_id = 4 tracking sql = select tracking_log from information_schema.load_tracking_logs where job_id=11119 #38

Open alberttwong opened 6 months ago

alberttwong commented 6 months ago

Stuck at step 5. https://docs.getdbt.com/guides/manual-install?step=5

(dbt-env) atwong@Albert-CelerData jaffle_shop % dbt run -d
17:10:53  Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'start', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x103a53760>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1087a1490>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1087a13a0>]}
17:10:53  Running with dbt=1.6.10
17:10:53  running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'write_json': 'True', 'log_cache_events': 'False', 'partial_parse': 'True', 'cache_selected_only': 'False', 'warn_error': 'None', 'fail_fast': 'False', 'debug': 'True', 'log_path': '/Users/atwong/sandbox/dbt-tutorial/jaffle_shop/logs', 'version_check': 'True', 'profiles_dir': '/Users/atwong/.dbt', 'use_colors': 'True', 'use_experimental_parser': 'False', 'no_print': 'None', 'quiet': 'False', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'static_parser': 'True', 'log_format': 'default', 'introspect': 'True', 'target_path': 'None', 'invocation_command': 'dbt run -d', 'send_anonymous_usage_stats': 'True'}
17:10:53  Sending event: {'category': 'dbt', 'action': 'project_id', 'label': 'c587bbfa-b9bc-4b0c-9ce7-2451bb59989d', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1086b1f70>]}
17:10:53  Sending event: {'category': 'dbt', 'action': 'adapter_info', 'label': 'c587bbfa-b9bc-4b0c-9ce7-2451bb59989d', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1092b8f40>]}
17:10:53  Registered adapter: starrocks=1.6.1
17:10:53  checksum: 28908e88b83a05550f08d8d5005b031bfd1e6cf5c944d3c74469cec401f04961, vars: {}, profile: , target: , version: 1.6.10
17:10:53  Partial parsing enabled: 0 files deleted, 0 files added, 0 files changed.
17:10:53  Partial parsing enabled, no changes found, skipping parsing
17:10:53  Sending event: {'category': 'dbt', 'action': 'load_project', 'label': 'c587bbfa-b9bc-4b0c-9ce7-2451bb59989d', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1094350d0>]}
17:10:53  Sending event: {'category': 'dbt', 'action': 'resource_counts', 'label': 'c587bbfa-b9bc-4b0c-9ce7-2451bb59989d', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1093fbaf0>]}
17:10:53  Found 2 models, 4 tests, 0 sources, 0 exposures, 0 metrics, 338 macros, 0 groups, 0 semantic models
17:10:53  Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': 'c587bbfa-b9bc-4b0c-9ce7-2451bb59989d', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10936e100>]}
17:10:53
17:10:53  Acquiring new starrocks connection 'master'
17:10:53  Acquiring new starrocks connection 'list_schemas'
17:10:53  Using starrocks connection "list_schemas"
17:10:53  On list_schemas: /* {"app": "dbt", "dbt_version": "1.6.10", "profile_name": "jaffle_shop", "target_name": "dev", "connection_name": "list_schemas"} */
select distinct schema_name from information_schema.schemata
17:10:53  Opening a new connection, currently in state init
17:10:53  SQL status: SUCCESS 5 in 0.0 seconds
17:10:53  On list_schemas: Close
17:10:53  Re-using an available connection from the pool (formerly list_schemas, now list_None_testing)
17:10:53  Using starrocks connection "list_None_testing"
17:10:53  On list_None_testing: BEGIN
17:10:53  Opening a new connection, currently in state closed
17:10:53  SQL status: SUCCESS 0 in 0.0 seconds
17:10:53  Using starrocks connection "list_None_testing"
17:10:53  On list_None_testing: /* {"app": "dbt", "dbt_version": "1.6.10", "profile_name": "jaffle_shop", "target_name": "dev", "connection_name": "list_None_testing"} */

    select
      null as "database",
      tbl.table_name as name,
      tbl.table_schema as "schema",
      case when tbl.table_type = 'BASE TABLE' then 'table'
           when tbl.table_type = 'VIEW' and mv.table_name is null then 'view'
           when tbl.table_type = 'VIEW' and mv.table_name is not null then 'materialized_view'
           when tbl.table_type = 'SYSTEM VIEW' then 'system_view'
           else 'unknown' end as table_type
    from information_schema.tables tbl
    left join information_schema.materialized_views mv
    on tbl.TABLE_SCHEMA = mv.TABLE_SCHEMA
    and tbl.TABLE_NAME = mv.TABLE_NAME
    where tbl.table_schema = 'testing'

17:10:53  SQL status: SUCCESS 0 in 0.0 seconds
17:10:53  On list_None_testing: ROLLBACK
17:10:53  On list_None_testing: Close
17:10:53  Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': 'c587bbfa-b9bc-4b0c-9ce7-2451bb59989d', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x109425c40>]}
17:10:53  Using starrocks connection "master"
17:10:53  On master: BEGIN
17:10:53  Opening a new connection, currently in state init
17:10:53  SQL status: SUCCESS 0 in 0.0 seconds
17:10:53  On master: COMMIT
17:10:53  Using starrocks connection "master"
17:10:53  On master: COMMIT
17:10:53  SQL status: SUCCESS 0 in 0.0 seconds
17:10:53  On master: Close
17:10:53  Concurrency: 1 threads (target='dev')
17:10:53
17:10:53  Began running node model.jaffle_shop.my_first_dbt_model
17:10:53  1 of 2 START sql table model testing.my_first_dbt_model ........................ [RUN]
17:10:53  Re-using an available connection from the pool (formerly list_None_testing, now model.jaffle_shop.my_first_dbt_model)
17:10:53  Began compiling node model.jaffle_shop.my_first_dbt_model
17:10:53  Writing injected SQL for node "model.jaffle_shop.my_first_dbt_model"
17:10:53  Timing info for model.jaffle_shop.my_first_dbt_model (compile): 10:10:53.865544 => 10:10:53.870432
17:10:53  Began executing node model.jaffle_shop.my_first_dbt_model
17:10:53  Opening a new connection, currently in state closed
17:10:53  Writing runtime sql for node "model.jaffle_shop.my_first_dbt_model"
17:10:53  Using starrocks connection "model.jaffle_shop.my_first_dbt_model"
17:10:53  On model.jaffle_shop.my_first_dbt_model: BEGIN
17:10:53  SQL status: SUCCESS 0 in 0.0 seconds
17:10:53  Using starrocks connection "model.jaffle_shop.my_first_dbt_model"
17:10:53  On model.jaffle_shop.my_first_dbt_model: /* {"app": "dbt", "dbt_version": "1.6.10", "profile_name": "jaffle_shop", "target_name": "dev", "node_id": "model.jaffle_shop.my_first_dbt_model"} */

  create table `testing`.`my_first_dbt_model__dbt_tmp`
    PROPERTIES (
      "replication_num" = "1"
    )
  as /*
    Welcome to your first dbt model!
    Did you know that you can also configure models directly within SQL files?
    This will override configurations stated in dbt_project.yml

    Try changing "table" to "view" below
*/

with source_data as (

    select 1 as id
    union all
    select null as id

)

select *
from source_data

/*
    Uncomment the line below to remove records with null `id` values
*/

-- where id is not null

17:10:53  starrocks adapter: StarRocks error: 1064 (HY000): Insert has filtered data in strict mode, txn_id = 4 tracking sql = select tracking_log from information_schema.load_tracking_logs where job_id=11119
17:10:53  On model.jaffle_shop.my_first_dbt_model: ROLLBACK
17:10:53  Timing info for model.jaffle_shop.my_first_dbt_model (execute): 10:10:53.870915 => 10:10:53.947047
17:10:53  On model.jaffle_shop.my_first_dbt_model: Close
17:10:53  Database Error in model my_first_dbt_model (models/example/my_first_dbt_model.sql)
  1064 (HY000): Insert has filtered data in strict mode, txn_id = 4 tracking sql = select tracking_log from information_schema.load_tracking_logs where job_id=11119
  compiled Code at target/run/jaffle_shop/models/example/my_first_dbt_model.sql
17:10:53  Sending event: {'category': 'dbt', 'action': 'run_model', 'label': 'c587bbfa-b9bc-4b0c-9ce7-2451bb59989d', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10947b700>]}
17:10:53  1 of 2 ERROR creating sql table model testing.my_first_dbt_model ............... [ERROR in 0.09s]
17:10:53  Finished running node model.jaffle_shop.my_first_dbt_model
17:10:53  Began running node model.jaffle_shop.my_second_dbt_model
17:10:53  2 of 2 SKIP relation testing.my_second_dbt_model ............................... [SKIP]
17:10:53  Finished running node model.jaffle_shop.my_second_dbt_model
17:10:53  Using starrocks connection "master"
17:10:53  On master: BEGIN
17:10:53  Opening a new connection, currently in state closed
17:10:53  SQL status: SUCCESS 0 in 0.0 seconds
17:10:53  On master: COMMIT
17:10:53  Using starrocks connection "master"
17:10:53  On master: COMMIT
17:10:53  SQL status: SUCCESS 0 in 0.0 seconds
17:10:53  On master: Close
17:10:53  Connection 'master' was properly closed.
17:10:53  Connection 'model.jaffle_shop.my_first_dbt_model' was properly closed.
17:10:53
17:10:53  Finished running 1 table model, 1 view model in 0 hours 0 minutes and 0.20 seconds (0.20s).
17:10:53  Command end result
17:10:53
17:10:53  Completed with 1 error and 0 warnings:
17:10:53
17:10:53    Database Error in model my_first_dbt_model (models/example/my_first_dbt_model.sql)
  1064 (HY000): Insert has filtered data in strict mode, txn_id = 4 tracking sql = select tracking_log from information_schema.load_tracking_logs where job_id=11119
  compiled Code at target/run/jaffle_shop/models/example/my_first_dbt_model.sql
17:10:53
17:10:53  Done. PASS=0 WARN=0 ERROR=1 SKIP=1 TOTAL=2
17:10:53  Command `dbt run` failed at 10:10:53.986391 after 0.37 seconds
17:10:53  Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x103a53760>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1092bf250>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x109470310>]}
17:10:53  Flushing usage events
alberttwong commented 6 months ago
mysql> select tracking_log from information_schema.load_tracking_logs where job_id=11119
    -> ;
ERROR 2013 (HY000): Lost connection to MySQL server during query
No connection. Trying to reconnect...
Connection id:    24
Current database: *** NONE ***

+-------------------------------------------------------------+
| tracking_log                                                |
+-------------------------------------------------------------+
| Error: NULL value in non-nullable column 'id'. Row: [NULL]
 |
+-------------------------------------------------------------+
1 row in set (0.05 sec)
alberttwong commented 6 months ago

based on https://docs.starrocks.io/docs/2.5/sql-reference/sql-statements/data-definition/CREATE_TABLE/, default should be able to have null values in column.

alberttwong commented 6 months ago

@lllong33 how about this issue?

lllong33 commented 6 months ago

@alberttwong I don't know SR internals, it doesn't seem to be a dbt parse issue.

drop table if exists `testing`.`my_first_dbt_model__dbt_tmp`;
create table `testing`.`my_first_dbt_model__dbt_tmp` PROPERTIES ( "replication_num" = "1") as 
--  select cast(null as int) as id -- work
--     union all
select null as id -- work
union all 
select 1 as id
-- union all 
-- select cast(null as int) as id -- not work: id must is not null
-- union all 
-- select null as id -- not work: id must is not null

image