Closed aavanzyl closed 4 months ago
Hi @aavanzyl. Could you check if this problem happens in 3.1.2 which was released today? @justinpark for awareness.
@michael-s-molina I have deployed 3.1.2 as asked. The issue currently is not showing, will monitor for a couple of days and let you know. Thank you for the response on my issue.
For further information, I can see that when clicking on run, it calls the execute
API. This in turn takes a while to respond and was an empty response when broken. When things are working as expected, the execute
API responds with a json object.
I also noticed that the updated_since
API call started before the execute
had a chance to receive a response, leading me to believe that this might contribute to the issue.
Thanks @aavanzyl. Let us know if things are working after your tests so we can close the issue.
We have had 48 hours of stability with the issue not poping up. Its safe to say its solved. Thank you @michael-s-molina and team for the swift assistance. I appreciate the effort on your side.
Hi guys, unfortunately after 6 days of stability, the issue has returned. This might be related to load in some way. The behaviour explained above seems better however, I no longer see empty execute being returned
What can I provide from my side to assist in solving the issue?
@justinpark could you help?
@aavanzyl glad to see the issue closed, but curious what happened!
@rusackas so just as a basic rundown
Then the major change I made was split out the workers into their own container tasks on ECS, they where previously grouped together under one ECS service. I am now able to scale the workers based on load.
With all of the above together, we have been stable the last couple of months.
We initially followed this blog: https://www.obytes.com/blog/superset-deployment, which was honestly a bad starting point, since correcting to the right infrastructure, its been running a lot better
Bug description
When using SQL Lab to execute a query such as
SELECT * from claims.claim limit 1
, the UI showsPending
and never starts increasing the execution time.The following is logged in the browser:
Execute:
{ "query": { "changed_on": "2024-04-02T07:16:18.493787", "dbId": 1, "db": "Amazon Redshift (Prod)", "endDttm": null, "errorMessage": null, "executedSql": null, "id": "9E6eWGcfB", "queryId": 26748, "limit": 1, "limitingFactor": "QUERY", "progress": 0, "rows": null, "schema": null, "ctas": false, "serverId": 26748, "sql": "SELECT * from claims.claim limit 1\n", "sqlEditorId": "1773", "startDttm": 1712042178240.671000, "state": "pending", "tab": "Untitled Query 1", "tempSchema": null, "tempTable": null, "userId": 1, "user": "Anton van Zyl", "resultsKey": null, "trackingUrl": null, "extra": {} } }
We get multiple updated since messages back such as:
{ "result": [ { "changed_on": "2024-04-02T07:16:18.242144", "ctas": false, "db": "Amazon Redshift (Prod)", "dbId": 1, "endDttm": null, "errorMessage": null, "executedSql": null, "extra": {}, "id": "9E6eWGcfB", "limit": 1000, "limitingFactor": "UNKNOWN", "progress": 0, "queryId": 26748, "resultsKey": null, "rows": null, "schema": null, "serverId": 26748, "sql": "SELECT * from claims.claim limit 1\n", "sqlEditorId": "1773", "startDttm": "1712042178240.671000", "state": "pending", "tab": "Untitled Query 1", "tempSchema": null, "tempTable": null, "trackingUrl": null, "user": "Anton van Zyl", "userId": 1 } ] }
But never receive and
/results
call back. My current remedy is to restart the instance and cache, which seems to intermittently solve the issue. This issue only started recently, with us updating to 3.1.0 and then 3.1.1.I have checked the logs, nothing pops up as out of the ordinary.
How to reproduce the bug
Go to SQL Lab and run any query will result in the query being stuck in
Pending
Screenshots/recordings
The current UI state being shown to the end user:
Superset version
3.1.1
Python version
I don't know
Node version
I don't know
Browser
Chrome
Additional context
We are running the latest 3.1.1 superset from a docker container hosted on ECS in AWS. The instance size has 12gigs of memory and connects to a Redshift database. Superset Caching is set up with Redis 7.1.0 running on elasticache in AWS.
Checklist