michelin / snowflake-grafana-datasource

Snowflake Grafana datasource plugin enables the visual representation of Snowflake data within Grafana dashboards and manages alerts.
Apache License 2.0
67 stars 33 forks source link

The plugin crashes when query returns no results. #53

Closed TimShilov closed 5 months ago

TimShilov commented 9 months ago

The plugin crashes when query returns no results. Is there a way to prevent that? I'm using the plugin to monitor Snowflake tasks and when everything's good the result has no rows which causes plugin error.

Here are the logs:

Nov 17 09:40:47 affluent-hub grafana[25533]: logger=plugin.michelin-snowflake-datasource t=2023-11-17T09:40:47.751451605Z level=info msg="Query config" config="map[queryText:SELECT DATABASE_NAME, SCHEMA_NAME, NAME, STATE, ERROR_MESSAGE, MAX(SCHEDULED_TIME) AS last_seen, COUNT(*) AS count\n  FROM SNOWFLAKE.account_usage.task_history\n WHERE STATE NOT IN ('CANCELLED', 'SKIPPED', 'SUCCEEDED')\n   AND $__timeFilter(SCHEDULED_TIME)\nGROUP BY DATABASE_NAME, SCHEMA_NAME, NAME, STATE, ERROR_MESSAGE\nHAVING count > 1\nORDER BY count DESC; queryType:table timeColumns:[time]]"
Nov 17 09:40:47 affluent-hub grafana[25533]: logger=plugin.michelin-snowflake-datasource t=2023-11-17T09:40:47.751568117Z level=info msg=Query finalQuery="SELECT DATABASE_NAME, SCHEMA_NAME, NAME, STATE, ERROR_MESSAGE, MAX(SCHEDULED_TIME) AS last_seen, COUNT(*) AS count\n  FROM SNOWFLAKE.account_usage.task_history\n WHERE STATE NOT IN ('CANCELLED', 'SKIPPED', 'SUCCEEDED')\n   AND CONVERT_TIMEZONE('UTC', 'UTC', SCHEDULED_TIME) > '2023-11-17T03:40:47.514Z' AND CONVERT_TIMEZONE('UTC', 'UTC', SCHEDULED_TIME) < '2023-11-17T09:40:47.514Z'\nGROUP BY DATABASE_NAME, SCHEMA_NAME, NAME, STATE, ERROR_MESSAGE\nHAVING count > 1\nORDER BY count DESC"
Nov 17 09:40:47 affluent-hub grafana[25533]: logger=plugin.michelin-snowflake-datasource t=2023-11-17T09:40:47.753939979Z level=info msg="Query config" config="map[queryText:SELECT SCHEDULED_TIME AS date, ERROR_MESSAGE, COUNT(*) AS count\n  FROM SNOWFLAKE.account_usage.task_history\n WHERE STATE NOT IN ('CANCELLED', 'SKIPPED', 'SUCCEEDED')\n   AND $__timeFilter(scheduled_time) \nGROUP BY date, ERROR_MESSAGE\nORDER by date ; queryType:time series timeColumns:[date]]"
Nov 17 09:40:47 affluent-hub grafana[25533]: logger=plugin.michelin-snowflake-datasource t=2023-11-17T09:40:47.754032368Z level=info msg=Query finalQuery="SELECT SCHEDULED_TIME AS date, ERROR_MESSAGE, COUNT(*) AS count\n  FROM SNOWFLAKE.account_usage.task_history\n WHERE STATE NOT IN ('CANCELLED', 'SKIPPED', 'SUCCEEDED')\n   AND CONVERT_TIMEZONE('UTC', 'UTC', scheduled_time) > '2023-11-17T03:40:47.514Z' AND CONVERT_TIMEZONE('UTC', 'UTC', scheduled_time) < '2023-11-17T09:40:47.514Z' \nGROUP BY date, ERROR_MESSAGE\nORDER by date  LIMIT 1656"
Nov 17 09:40:47 affluent-hub grafana[25533]: logger=plugin.michelin-snowflake-datasource t=2023-11-17T09:40:47.819506031Z level=info msg="Query config" config="map[queryText:SELECT LAST_LOAD_TIME, FIRST_ERROR_MESSAGE, SUM(ERROR_COUNT)\n  FROM snowflake.account_usage.copy_history\n WHERE $__timeFilter(LAST_LOAD_TIME) \nGROUP BY LAST_LOAD_TIME, FIRST_ERROR_MESSAGE\nORDER BY LAST_LOAD_TIME ASC; queryType:time series timeColumns:[date]]"
Nov 17 09:40:47 affluent-hub grafana[25533]: logger=plugin.michelin-snowflake-datasource t=2023-11-17T09:40:47.819616075Z level=info msg=Query finalQuery="SELECT LAST_LOAD_TIME, FIRST_ERROR_MESSAGE, SUM(ERROR_COUNT)\n  FROM snowflake.account_usage.copy_history\n WHERE CONVERT_TIMEZONE('UTC', 'UTC', LAST_LOAD_TIME) > '2023-11-17T03:40:47.514Z' AND CONVERT_TIMEZONE('UTC', 'UTC', LAST_LOAD_TIME) < '2023-11-17T09:40:47.514Z' \nGROUP BY LAST_LOAD_TIME, FIRST_ERROR_MESSAGE\nORDER BY LAST_LOAD_TIME ASC LIMIT 1656"
Nov 17 09:40:47 affluent-hub grafana[25533]: logger=plugin.michelin-snowflake-datasource t=2023-11-17T09:40:47.991167458Z level=info msg="Query config" config="map[queryText:SELECT PIPE_CATALOG_NAME AS DATABASE_NAME, PIPE_SCHEMA_NAME AS SCHEMA_NAME, PIPE_NAME, FIRST_ERROR_MESSAGE, SUM(ERROR_COUNT) AS ERROR_COUNT\n  FROM snowflake.account_usage.copy_history\n WHERE ERROR_COUNT > 0\n   AND $__timeFilter(LAST_LOAD_TIME)\n   GROUP BY DATABASE_NAME, SCHEMA_NAME, PIPE_NAME, FIRST_ERROR_MESSAGE; queryType:table timeColumns:[time]]"
Nov 17 09:40:47 affluent-hub grafana[25533]: logger=plugin.michelin-snowflake-datasource t=2023-11-17T09:40:47.991294673Z level=info msg=Query finalQuery="SELECT PIPE_CATALOG_NAME AS DATABASE_NAME, PIPE_SCHEMA_NAME AS SCHEMA_NAME, PIPE_NAME, FIRST_ERROR_MESSAGE, SUM(ERROR_COUNT) AS ERROR_COUNT\n  FROM snowflake.account_usage.copy_history\n WHERE ERROR_COUNT > 0\n   AND CONVERT_TIMEZONE('UTC', 'UTC', LAST_LOAD_TIME) > '2023-11-17T03:40:47.514Z' AND CONVERT_TIMEZONE('UTC', 'UTC', LAST_LOAD_TIME) < '2023-11-17T09:40:47.514Z'\n   GROUP BY DATABASE_NAME, SCHEMA_NAME, PIPE_NAME, FIRST_ERROR_MESSAGE"
Nov 17 09:40:51 affluent-hub grafana[25533]: logger=plugin.michelin-snowflake-datasource t=2023-11-17T09:40:51.984634716Z level=error msg="Could not convert long frame to wide frame" err="can not convert to wide series, input fields have no rows"
Nov 17 09:40:51 affluent-hub grafana[25533]: logger=context userId=1 orgId=1 uname=admin t=2023-11-17T09:40:51.990688787Z level=error msg="Internal server error" error="[plugin.downstreamError] client: failed to query data: Failed to query data: rpc error: code = Unavailable desc = error reading from server: EOF" remote_addr=163.116.162.123 traceID=
Nov 17 09:40:51 affluent-hub grafana[25533]: logger=context userId=1 orgId=1 uname=admin t=2023-11-17T09:40:51.99078775Z level=error msg="Internal server error" error="[plugin.downstreamError] client: failed to query data: Failed to query data: rpc error: code = Unavailable desc = error reading from server: EOF" remote_addr=163.116.162.123 traceID=
Nov 17 09:40:51 affluent-hub grafana[25533]: logger=context userId=1 orgId=1 uname=admin t=2023-11-17T09:40:51.990810508Z level=error msg="Request Completed" method=POST path=/api/ds/query status=500 remote_addr=163.116.162.123 time_ms=4381 duration=4.381258906s size=116 referer="https://grafana.afflu.net/d/d4d6589b-97d0-4ff8-83c6-1a64a74501ed/snowflake-monitoring?orgId=1" handler=/api/ds/query
Nov 17 09:40:51 affluent-hub grafana[25533]: logger=context userId=1 orgId=1 uname=admin t=2023-11-17T09:40:51.990863974Z level=error msg="Request Completed" method=POST path=/api/ds/query status=500 remote_addr=163.116.162.123 time_ms=4376 duration=4.376350174s size=116 referer="https://grafana.afflu.net/d/d4d6589b-97d0-4ff8-83c6-1a64a74501ed/snowflake-monitoring?orgId=1" handler=/api/ds/query
Nov 17 09:40:51 affluent-hub grafana[25533]: logger=context userId=1 orgId=1 uname=admin t=2023-11-17T09:40:51.990688783Z level=error msg="Internal server error" error="[plugin.downstreamError] client: failed to query data: Failed to query data: rpc error: code = Unavailable desc = error reading from server: EOF" remote_addr=163.116.162.123 traceID=
Nov 17 09:40:51 affluent-hub grafana[25533]: logger=context userId=1 orgId=1 uname=admin t=2023-11-17T09:40:51.991104203Z level=error msg="Request Completed" method=POST path=/api/ds/query status=500 remote_addr=163.116.162.123 time_ms=4078 duration=4.078323286s size=116 referer="https://grafana.afflu.net/d/d4d6589b-97d0-4ff8-83c6-1a64a74501ed/snowflake-monitoring?orgId=1" handler=/api/ds/query
Nov 17 09:40:51 affluent-hub grafana[25533]: logger=context userId=1 orgId=1 uname=admin t=2023-11-17T09:40:51.99121181Z level=error msg="Internal server error" error="[plugin.downstreamError] client: failed to query data: Failed to query data: rpc error: code = Unavailable desc = error reading from server: EOF" remote_addr=163.116.162.123 traceID=
Nov 17 09:40:51 affluent-hub grafana[25533]: logger=context userId=1 orgId=1 uname=admin t=2023-11-17T09:40:51.991281956Z level=error msg="Request Completed" method=POST path=/api/ds/query status=500 remote_addr=163.116.162.123 time_ms=4174 duration=4.17457895s size=116 referer="https://grafana.afflu.net/d/d4d6589b-97d0-4ff8-83c6-1a64a74501ed/snowflake-monitoring?orgId=1" handler=/api/ds/query
Nov 17 09:40:51 affluent-hub grafana[25533]: logger=plugin.michelin-snowflake-datasource t=2023-11-17T09:40:51.992052969Z level=error msg="plugin process exited" path=/var/lib/grafana/plugins/michelin-snowflake-datasource/gpx_snowflake-datasource_linux_amd64 pid=2563 error="exit status 2"
devnied commented 8 months ago

I can't reproduce this issue with Grafana 9 and 10. Which version of Grafana are you using ?

TimShilov commented 8 months ago

Hello @devnied . I'm using a self-hosted Grafana v10.2.2 (161e3cac5075540918e3a39004f2364ad104d5bb) at the moment.

devnied commented 8 months ago

Thanks, and which version of the plugin ?

TimShilov commented 8 months ago

The plugin is 1.4.1.

surfous commented 8 months ago

@devnied, I'm experiencing this as well, but it happens specifically when I select a "Time Series" Query Type and the query returns no data. If I switch back to "Table," the error does not occur.

I'm on grafana 9.1.0 with v1.4.1 of the plugin as well - haven't tried the new v1.5.0 yet.

This seems to be the case for @TimShilov as well since I see Could not convert long frame to wide frame in his log snippet, and that error originates here in the plugin code where it handles the TimeSeries type query. Tracing back a bit further, this occurs because the plugin SDK function data.LongToWide hands back the error, "can not convert to wide series, input fields have no rows".

rumbin commented 7 months ago

Confirming the observed behavior on plugin version 1.5.0 and self-hosted Grafana 10.3.1.

rumbin commented 5 months ago

In my understanding, the error only occurs for time series which are build from narrow tables (long tables, EAV tables). It seems to be caused by the pivoting operation which tries to group be label column values but fails due to the empty result set.

WojtekWaga commented 5 months ago

Hey, the problem is that when: frame, err = data.LongToWide(frame, nil) returns err the frame is set to nil, then we're calling frame.Fields

The linked PR should resolve this issue.