grafana / timestream-datasource

Amazon Timestream in Grafana
https://grafana.com/grafana/plugins/grafana-timestream-datasource
Apache License 2.0

Support long running queries in timestream #256

Open sarahzinger opened 10 months ago

sarahzinger commented 10 months ago

Discussed in https://github.com/grafana/timestream-datasource/discussions/248

Originally posted by **dleonard-nasuni** October 6, 2023

I am using the timestream-datasource, and have a query that takes a particularly large amount of time. I occasionally receive a 504 server error when I hover over the panel that has the query that is timing out, and each instance of the timeout is right around the minute mark. The logs in `/var/log/grafana/grafana.log` seem to indicate that this timeout is on the timestream-datasource plugin side due to cancelled context. Is there a way to configure this timeout?

![image](https://github.com/grafana/timestream-datasource/assets/69813119/902013a3-88f8-4de2-831a-1645512a3b7d)

Here are the `/var/log/grafana/grafana.log` logs:

```
logger=context userId=4 orgId=1 uname= t=2023-10-06T20:05:45.450493766Z level=error msg="Internal server error" error="[plugin.downstreamError] failed to query data: Failed to query data: rpc error: code = Canceled desc = context canceled" remote_addr= traceID=
logger=context userId=4 orgId=1 uname= t=2023-10-06T20:05:45.450601507Z level=error msg="Request Completed" method=POST path=/api/ds/query status=500 remote_addr= time_ms=60003 duration=1m0.003610261s size=116 referer="https://" handler=/api/ds/query
logger=context userId=4 orgId=1 uname= t=2023-10-06T20:05:45.450601507Z level=error msg="Request Completed" method=POST path=/api/ds/query status=500 remote_addr= time_ms=60003 duration=1m0.003610261s size=116 referer="https://" handler=/api/ds/query
```

sarahzinger commented 10 months ago

@dleonard-nasuni just a heads up: we're moving from discussions to issues so we can more easily track user feedback, so I'm converting your discussion into an issue. Are you still having trouble with Timestream?

dleonard-nasuni commented 10 months ago

Hi @sarahzinger, thank you for opening this as an issue. I am still experiencing timeouts around the 1 minute mark for some data-intensive panels. I tried setting `timeout` in the `jsonData` section of my datasource configuration in timestream.yaml (after trying to work out where timeouts are picked up in the code), but queries still timed out at 1 minute even with this field set to 30 seconds.

iwysiu commented 10 months ago

I took a look at this, and it sounds like a long-running query issue. Some of the other datasources (CloudWatch Logs, Athena, and Redshift) handle this by having the frontend poll the backend periodically for updates, which we can and probably should do here.
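
For illustration, here is a minimal, language-neutral sketch of that polling pattern, written in Python for brevity. It is not the plugin's code: `FakeBackend`, `start_query`, `get_status`, `fetch_results`, and `run_long_query` are all hypothetical names, and the backend is simulated in memory. The point is only that each request returns quickly, so no single HTTP call has to outlive a roughly one-minute request timeout.

```python
import time
from dataclasses import dataclass, field


@dataclass
class FakeBackend:
    """Hypothetical in-memory stand-in for a plugin backend (not a real API)."""

    polls_until_done: int = 3  # simulated query length, measured in status polls
    _polls: dict = field(default_factory=dict)

    def start_query(self, sql: str) -> str:
        # Return an ID immediately instead of blocking until the query finishes.
        query_id = f"q{len(self._polls) + 1}"
        self._polls[query_id] = 0
        return query_id

    def get_status(self, query_id: str) -> str:
        # Each status check is a short request of its own.
        self._polls[query_id] += 1
        if self._polls[query_id] >= self.polls_until_done:
            return "done"
        return "running"

    def fetch_results(self, query_id: str) -> list:
        # Placeholder payload; a real backend would return data frames.
        return [{"query_id": query_id, "status": "done"}]


def run_long_query(backend, sql, poll_interval=0.01, max_wait=5.0):
    """Start a query, then poll with short requests until it completes."""
    query_id = backend.start_query(sql)
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        if backend.get_status(query_id) == "done":
            return backend.fetch_results(query_id)
        time.sleep(poll_interval)
    raise TimeoutError(f"query {query_id} still running after {max_wait}s")


if __name__ == "__main__":
    print(run_long_query(FakeBackend(), "SELECT 1"))
```

In this sketch the overall wait is bounded by `max_wait` on the caller's side rather than by the lifetime of any single request, which is why the pattern sidesteps a fixed per-request timeout.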

Implementing that would be a decently large feature/refactor, so I'm going to move this into our backlog for now and our team will prioritize this against other work.