elementary-data / elementary

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
https://www.elementary-data.com/
Apache License 2.0

0.15.1 EDR Monitor Fails with pydantic.v1.error_wrappers.ValidationError #1676

Open MichaelT950 opened 3 months ago

MichaelT950 commented 3 months ago

Describe the bug

The edr monitor command fails when running edr version 0.15.1 with dbt version 1.7.4 on Snowflake.

To Reproduce

  1. Create the environment
  2. Run command

In an environment with edr version 0.15.1 and dbt version 1.7.4, running the following command produces the error below:

edr monitor --slack-webhook {REDACTED} --select statuses:warn,fail,error

2024-08-19 16:00:22 — INFO — edr (0.15.1) and Elementary's dbt package (0.15.1) are compatible.
     2024-08-19 16:00:26 — INFO — Elementary's database and schema: '"{REDACTED}"'
     2024-08-19 16:00:26 — INFO — Running internal dbt run to populate alerts
     2024-08-19 16:00:26 — INFO — Running dbt run -m elementary_cli.alerts.alerts_v2 --project-dir /usr/local/lib/python3.8/site-packages/elementary/monitor/dbt_project --vars {"days_back": 1}
     16:00:28  Running with dbt=1.7.4
     16:00:28  target not specified in profile 'elementary', using 'default'
     16:00:29  Registered adapter: snowflake=1.7.1
     16:00:29  Unable to do partial parsing because config vars, config profile, or config target have changed
     16:00:35  Found 39 models, 2 operations, 6 tests, 6 sources, 0 exposures, 0 metrics, 1395 macros, 0 groups, 0 semantic models
     16:00:35  
     16:00:36  
     16:00:36  Running 1 on-run-start hook
     16:00:36  1 of 1 START hook: elementary.on-run-start.0 ................................... [RUN]
     16:00:36  1 of 1 OK hook: elementary.on-run-start.0 ...................................... [OK in 0.00s]
     16:00:36  
     16:00:36  Concurrency: 10 threads (target='default')
     16:00:36  
     16:00:36  1 of 1 START sql incremental model {REDACTED}.alerts_v2 .................. [RUN]
     16:00:41  1 of 1 OK created sql incremental model {REDACTED}.alerts_v2 ............. [SUCCESS 0 in 4.14s]
     16:00:41  
     16:00:41  Running 1 on-run-end hook
     16:00:41  1 of 1 START hook: elementary.on-run-end.0 ..................................... [RUN]
     16:00:41  1 of 1 OK hook: elementary.on-run-end.0 ........................................ [OK in 0.00s]
     16:00:41  
     16:00:41  
     16:00:41  Finished running 1 incremental model, 2 hooks in 0 hours 0 minutes and 5.69 seconds (5.69s).
     16:00:41  
     16:00:41  Completed successfully
     16:00:41  
     16:00:41  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
     2024-08-19 16:00:41 — INFO — Running dbt --log-format json run-operation elementary.log_macro_results --args {"macro_name": "elementary_cli.get_pending_alerts", "macro_args": {"days_back": 1, "type": null}} --project-dir /usr/local/lib/python3.8/site-packages/elementary/monitor/dbt_project
     Traceback (most recent call last):
       File "/usr/local/bin/edr", line 8, in <module>
         sys.exit(cli())
       File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
         return self.main(*args, **kwargs)
       File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1055, in main
         rv = self.invoke(ctx)
       File "/usr/local/lib/python3.8/site-packages/elementary/cli/cli.py", line 67, in invoke
         return super().invoke(ctx)
       File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
         return _process_result(sub_ctx.command.invoke(sub_ctx))
       File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1635, in invoke
         rv = super().invoke(ctx)
       File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
         return ctx.invoke(self.callback, **ctx.params)
       File "/usr/local/lib/python3.8/site-packages/click/core.py", line 760, in invoke
         return __callback(*args, **kwargs)
       File "/usr/local/lib/python3.8/site-packages/click/decorators.py", line 26, in new_func
         return f(get_current_context(), *args, **kwargs)
       File "/usr/local/lib/python3.8/site-packages/elementary/monitor/cli.py", line 364, in monitor
         success = data_monitoring.run_alerts(
       File "/usr/local/lib/python3.8/site-packages/elementary/monitor/data_monitoring/alerts/data_monitoring_alerts.py", line 305, in run_alerts
         alerts = self._fetch_data(days_back)
       File "/usr/local/lib/python3.8/site-packages/elementary/monitor/data_monitoring/alerts/data_monitoring_alerts.py", line 90, in _fetch_data
         return self.alerts_api.get_new_alerts(
       File "/usr/local/lib/python3.8/site-packages/elementary/monitor/api/alerts/alerts.py", line 28, in get_new_alerts
         pending_alerts = self.alerts_fetcher.query_pending_alerts(days_back=days_back)
       File "/usr/local/lib/python3.8/site-packages/elementary/monitor/fetchers/alerts/alerts.py", line 47, in query_pending_alerts
         return [
       File "/usr/local/lib/python3.8/site-packages/elementary/monitor/fetchers/alerts/alerts.py", line 48, in <listcomp>
         PendingAlertSchema(**result)
       File "/usr/local/lib/python3.8/site-packages/pydantic/v1/main.py", line 341, in __init__
         raise validation_error
     pydantic.v1.error_wrappers.ValidationError: 1 validation error for PendingAlertSchema
     __root__ -> full_refresh
       none is not an allowed value (type=type_error.none.not_allowed)
     + echo 'Unable to send slack alerts.'
     Unable to send slack alerts.

It appears this pydantic error occurs because of null values in the SENT_AT column of the alerts_v2 table (please see the attached screenshot).

Expected behavior

edr monitor should send Slack notifications to the webhook provided. Instead, it fails with the error above and no Slack notifications are sent.

Screenshots image

Environment:

haritamar commented 3 months ago

Hi @MichaelT950! Based on the stack trace you shared above, it actually looks like the problematic field is full_refresh. Specifically, it probably means there are rows in the dbt_run_results table where full_refresh is NULL. That's not supposed to happen, so would you mind sharing an example of such a row?

Thanks, Itamar
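
For readers following along, the failure mode described above can be sketched with the standard library alone. This is a stand-in, not pydantic and not Elementary's code: StrictRow, LenientRow, NoneNotAllowed and try_build are hypothetical names, and it assumes (the schema is not shown in this thread) that PendingAlertSchema declares full_refresh as a required, non-optional boolean.

```python
from dataclasses import dataclass, fields
from typing import Optional


class NoneNotAllowed(ValueError):
    """Stand-in for the 'none is not an allowed value' validation error."""


@dataclass
class StrictRow:
    # Hypothetical schema: every field is required and may not be None.
    unique_id: str
    full_refresh: bool

    def __post_init__(self):
        # Reject None in any field, mimicking a non-optional field check.
        for f in fields(self):
            if getattr(self, f.name) is None:
                raise NoneNotAllowed(f"{f.name}: none is not an allowed value")


@dataclass
class LenientRow:
    # Same shape, but full_refresh is declared Optional, so None is accepted.
    unique_id: str
    full_refresh: Optional[bool] = None


def try_build(schema, row):
    """Return (instance, None) on success or (None, error message) on failure."""
    try:
        return schema(**row), None
    except NoneNotAllowed as exc:
        return None, str(exc)


# An invented row with a NULL full_refresh, as it would arrive from the warehouse.
row = {"unique_id": "model.example.orders", "full_refresh": None}

strict_result, strict_error = try_build(StrictRow, row)
lenient_result, lenient_error = try_build(LenientRow, row)

print(strict_error)
print(lenient_result)
```

The strict variant rejects the row on full_refresh, while the variant that declares the field as optional accepts it. Whether the right fix is relaxing the schema or preventing NULLs from being written in the first place is not settled in this thread.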

MichaelT950 commented 3 months ago

Hi haritamar,

Thanks so much for your support here - much appreciated!

Our dbt_run_results table has 14,866 rows, of which 12,278 have a null full_refresh. What would cause this to occur? Running the query select distinct full_refresh from DBT_RUN_RESULTS shows only null and FALSE values.

I've attached an example row as requested: EDR.csv
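
A grouped count gives a slightly fuller picture than select distinct, since it shows how many rows fall under each full_refresh value. The sketch below runs such a query against an in-memory SQLite table purely for illustration. The rows are invented, the table has only two columns, and the same SELECT would need to be run against the real DBT_RUN_RESULTS table on Snowflake to reproduce the numbers reported here.

```python
import sqlite3

# In-memory stand-in for the warehouse table; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dbt_run_results (unique_id TEXT, full_refresh BOOLEAN)")
conn.executemany(
    "INSERT INTO dbt_run_results VALUES (?, ?)",
    [("model.a", None), ("model.b", None), ("model.c", False)],
)

# GROUP BY keeps NULL as its own group, so the result shows how many rows have no value.
counts = conn.execute(
    "SELECT full_refresh, COUNT(*) FROM dbt_run_results "
    "GROUP BY full_refresh ORDER BY COUNT(*) DESC"
).fetchall()

for value, n in counts:
    print(value, n)

conn.close()
```

SQLite stores the Python value False as the integer 0, so the non-null group prints as 0 here rather than FALSE.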

ausiddiqui commented 4 weeks ago

@elongl wondering if you had any updates on this -- we discussed this issue at Coalesce. Let me know if you'd like any additional logging or other info.