A fully incremental model, that transforms raw web & mobile event data generated by the Snowplow JavaScript & mobile trackers into a series of derived tables of varying levels of aggregation.
Other
13
stars
6
forks
source link
`event_counts_string_query` results in error when run contains no data #26
When a run contains no data, we would like to pass the models gracefully and assume there will be data in the next run. This fails currently when event counts are enabled because there are no column values to loop over.
Steps to reproduce
Enable event counts, do a run where there is no data in the date range.
Expected results
Models all pass but no data is written
Actual results
An error is raised
Screenshots and log output
08:01:08 ************************
08:01:08 * Running dbt deps *
08:01:08 ************************
[0m08:01:21 Running with dbt=1.6.9
[0m08:01:21 Installing snowplow/snowplow_unified
[0m08:01:22 Installed from version 0.1.2
[0m08:01:22 Updated version available: 0.2.0
[0m08:01:22 Installing snowplow/snowplow_utils
[0m08:01:22 Installed from version 0.15.2
[0m08:01:22 Updated version available: 0.16.1
[0m08:01:22 Installing dbt-labs/dbt_utils
[0m08:01:23 Installed from version 1.1.1
[0m08:01:23 Up to date!
[0m08:01:23
[0m08:01:23 Updates available for packages: ['snowplow/snowplow_unified', 'snowplow/snowplow_utils']
Update your versions in packages.yml, then run dbt deps
08:01:24 ************************
08:01:24 * Running dbt seed *
08:01:24 ************************
[0m08:01:29 Running with dbt=1.6.9
[0m08:01:31 Registered adapter: snowflake=1.6.5
[0m08:01:31 Unable to do partial parsing because saved manifest not found. Starting full parse.
[0m08:01:46 Found 19 models, 96 tests, 3 seeds, 3 operations, 17 sources, 0 exposures, 0 metrics, 780 macros, 0 groups, 0 semantic models
[0m08:01:46
[0m08:01:50
[0m08:01:50 Running 2 on-run-start hooks
[0m08:01:50 1 of 2 START hook: snowplow_unified.on-run-start.0 ............................. [RUN]
[0m08:01:50 1 of 2 OK hook: snowplow_unified.on-run-start.0 ................................ [[32mOK[0m in 0.00s]
[0m08:01:50 2 of 2 START hook: snowplow_unified.on-run-start.1 ............................. [RUN]
[0m08:01:50 2 of 2 OK hook: snowplow_unified.on-run-start.1 ................................ [[32mOK[0m in 0.00s]
[0m08:01:50
[0m08:01:50 Concurrency: 1 threads (target='prod')
[0m08:01:50
[0m08:01:50 1 of 3 START seed file atomic33_snowplow_manifest.snowplow_unified_dim_ga4_source_categories [RUN]
[0m08:01:52 1 of 3 OK loaded seed file atomic33_snowplow_manifest.snowplow_unified_dim_ga4_source_categories [[32mINSERT 819[0m in 2.08s]
[0m08:01:52 2 of 3 START seed file atomic33_snowplow_manifest.snowplow_unified_dim_geo_country_mapping [RUN]
[0m08:01:54 2 of 3 OK loaded seed file atomic33_snowplow_manifest.snowplow_unified_dim_geo_country_mapping [[32mINSERT 249[0m in 1.95s]
[0m08:01:54 3 of 3 START seed file atomic33_snowplow_manifest.snowplow_unified_dim_rfc_5646_language_mapping [RUN]
[0m08:01:56 3 of 3 OK loaded seed file atomic33_snowplow_manifest.snowplow_unified_dim_rfc_5646_language_mapping [[32mINSERT 232[0m in 2.11s]
[0m08:01:56
[0m08:01:56 Running 1 on-run-end hook
[0m08:01:56 1 of 1 START hook: snowplow_unified.on-run-end.0 ............................... [RUN]
[0m08:01:56 1 of 1 OK hook: snowplow_unified.on-run-end.0 .................................. [[32mOK[0m in 0.00s]
[0m08:01:56
[0m08:01:56
[0m08:01:56 Finished running 3 seeds, 3 hooks in 0 hours 0 minutes and 9.73 seconds (9.73s).
[0m08:01:56
[0m08:01:56 [32mCompleted successfully[0m
[0m08:01:56
[0m08:01:56 Done. PASS=3 WARN=0 ERROR=0 SKIP=0 TOTAL=3
08:01:57 ************************
08:01:57 * Running dbt snapshot *
08:01:57 ************************
08:01:59 Running with dbt=1.6.9
08:02:00 Registered adapter: snowflake=1.6.5
08:02:01 Found 19 models, 96 tests, 3 seeds, 3 operations, 17 sources, 0 exposures, 0 metrics, 780 macros, 0 groups, 0 semantic models
08:02:01
08:02:01 Nothing to do. Try checking your model configs and model specification args
08:02:01 ************************
08:02:01 * Running dbt run *
08:02:01 ************************
08:02:04 Running with dbt=1.6.9
08:02:05 Registered adapter: snowflake=1.6.5
08:02:05 Found 19 models, 96 tests, 3 seeds, 3 operations, 17 sources, 0 exposures, 0 metrics, 780 macros, 0 groups, 0 semantic models
08:02:05
08:02:09
08:02:09 Running 2 on-run-start hooks
08:02:09 1 of 2 START hook: snowplow_unified.on-run-start.0 ............................. [RUN]
08:02:09 1 of 2 OK hook: snowplow_unified.on-run-start.0 ................................ [OK in 0.00s]
08:02:09 2 of 2 START hook: snowplow_unified.on-run-start.1 ............................. [RUN]
08:02:09 2 of 2 OK hook: snowplow_unified.on-run-start.1 ................................ [OK in 0.00s]
08:02:09
08:02:09 Concurrency: 1 threads (target='prod')
08:02:09
08:02:09 1 of 19 START sql incremental model atomic33_snowplow_manifest.snowplow_unified_base_quarantined_sessions [RUN]
08:02:11 1 of 19 OK created sql incremental model atomic33_snowplow_manifest.snowplow_unified_base_quarantined_sessions [SUCCESS 0 in 2.42s]
08:02:11 2 of 19 START sql incremental model atomic33_snowplow_manifest.snowplow_unified_incremental_manifest [RUN]
08:02:14 2 of 19 OK created sql incremental model atomic33_snowplow_manifest.snowplow_unified_incremental_manifest [SUCCESS 0 in 2.34s]
08:02:14 3 of 19 START sql table model atomic33_scratch.snowplow_unified_base_new_event_limits [RUN]
08:02:14 08:02:14 + Snowplow: No data in manifest. Processing data from start_date
08:02:16 08:02:16 + Snowplow: Processing data between '2023-11-08 00:00:00' and '2023-12-08 00:00:00' (snowplow_unified)
08:02:16 3 of 19 OK created sql table model atomic33_scratch.snowplow_unified_base_new_event_limits [SUCCESS 1 in 2.56s]
08:02:16 4 of 19 START sql incremental model atomic33_snowplow_manifest.snowplow_unified_base_sessions_lifecycle_manifest [RUN]
08:02:18 4 of 19 OK created sql incremental model atomic33_snowplow_manifest.snowplow_unified_base_sessions_lifecycle_manifest [SUCCESS 0 in 1.87s]
08:02:18 5 of 19 START sql table model atomic33_scratch.snowplow_unified_base_sessions_this_run [RUN]
08:02:19 5 of 19 OK created sql table model atomic33_scratch.snowplow_unified_base_sessions_this_run [SUCCESS 1 in 1.28s]
08:02:19 6 of 19 START sql table model atomic33_scratch.snowplow_unified_base_events_this_run [RUN]
08:02:20 08:02:20 + Snowplow Warning: *************
08:02:20 Snowplow Warning: No data in snowplow33.atomic33_scratch.snowplow_unified_base_events_this_run for date range from variables, please modify your run variables to include data if this is not expected.
08:02:20 08:02:20 + Snowplow Warning: *************
08:02:21 6 of 19 OK created sql table model atomic33_scratch.snowplow_unified_base_events_this_run [SUCCESS 1 in 1.96s]
08:02:21 7 of 19 START sql table model atomic33_scratch.snowplow_unified_events_this_run [RUN]
08:02:23 7 of 19 OK created sql table model atomic33_scratch.snowplow_unified_events_this_run [SUCCESS 1 in 1.33s]
08:02:23 8 of 19 START sql table model atomic33_scratch.snowplow_unified_pv_engaged_time [RUN]
08:02:24 8 of 19 OK created sql table model atomic33_scratch.snowplow_unified_pv_engaged_time [SUCCESS 1 in 1.34s]
08:02:24 9 of 19 START sql table model atomic33_scratch.snowplow_unified_pv_scroll_depth [RUN]
08:02:25 9 of 19 OK created sql table model atomic33_scratch.snowplow_unified_pv_scroll_depth [SUCCESS 1 in 0.67s]
08:02:25 10 of 19 START sql table model atomic33_scratch.snowplow_unified_sessions_this_run [RUN]
08:02:25 10 of 19 ERROR creating sql table model atomic33_scratch.snowplow_unified_sessions_this_run [ERROR in 0.50s]
08:02:25 11 of 19 START sql incremental model atomic33_derived.snowplow_unified_user_mapping [RUN]
08:02:27 11 of 19 OK created sql incremental model atomic33_derived.snowplow_unified_user_mapping [SUCCESS 0 in 2.13s]
08:02:27 12 of 19 START sql table model atomic33_scratch.snowplow_unified_views_this_run [RUN]
08:02:29 12 of 19 OK created sql table model atomic33_scratch.snowplow_unified_views_this_run [SUCCESS 1 in 2.16s]
08:02:29 13 of 19 SKIP relation atomic33_derived.snowplow_unified_sessions .............. [SKIP]
08:02:29 14 of 19 START sql incremental model atomic33_derived.snowplow_unified_views ... [RUN]
08:02:32 14 of 19 OK created sql incremental model atomic33_derived.snowplow_unified_views [SUCCESS 0 in 2.51s]
08:02:32 15 of 19 SKIP relation atomic33_scratch.snowplow_unified_users_sessions_this_run [SKIP]
08:02:32 16 of 19 SKIP relation atomic33_scratch.snowplow_unified_users_aggs ............ [SKIP]
08:02:32 17 of 19 SKIP relation atomic33_scratch.snowplow_unified_users_lasts ........... [SKIP]
08:02:32 18 of 19 SKIP relation atomic33_scratch.snowplow_unified_users_this_run ........ [SKIP]
08:02:32 19 of 19 SKIP relation atomic33_derived.snowplow_unified_users ................. [SKIP]
08:02:32
08:02:32 Running 1 on-run-end hook
08:02:32 1 of 1 START hook: snowplow_unified.on-run-end.0 ............................... [RUN]
08:02:33 1 of 1 OK hook: snowplow_unified.on-run-end.0 .................................. [SUCCESS 0 in 1.04s]
08:02:33
08:02:33
08:02:33 Finished running 7 incremental models, 12 table models, 3 hooks in 0 hours 0 minutes and 28.17 seconds (28.17s).
08:02:33
08:02:33 Completed with 1 error and 0 warnings:
08:02:33
08:02:33 Compilation Error in model snowplow_unified_sessions_this_run (models/sessions/scratch/snowplow_unified_sessions_this_run.sql)
'NoneType' object is not iterable
> in macro event_counts_string_query (macros/field_definitions/event_counts_string_query.sql)
> called by model snowplow_unified_sessions_this_run (models/sessions/scratch/snowplow_unified_sessions_this_run.sql)
08:02:33
08:02:33 Done. PASS=12 WARN=0 ERROR=1 SKIP=6 TOTAL=19
Describe the bug
When a run contains no data, we would like to pass the models gracefully and assume there will be data in the next run. This fails currently when event counts are enabled because there are no column values to loop over.
Steps to reproduce
Enable event counts, do a run where there is no data in the date range.
Expected results
Models all pass but no data is written
Actual results
An error is raised
Screenshots and log output
System information
The contents of your
packages.yml
file:Which database are you using dbt with?
The output of
dbt --version
:The operating system you're using:
The output of
python --version
:Additional context
This is because the get column values macro returns none when it is empty, we should specify a default instead to make it an empty list https://github.com/dbt-labs/dbt-utils/blob/main/macros/sql/get_column_values.sql
Are you interested in contributing towards the fix?