cube-js / cube

📊 Cube — The Semantic Layer for Building Data Applications
https://cube.dev
Other
17.95k stars 1.78k forks source link

Redis parser inconsistencies: 'Unbalanced Quotes in request', 'unknown command `$52685`' #2589

Closed fabiosenracorrea closed 2 years ago

fabiosenracorrea commented 3 years ago

Describe the bug I've been facing some really inconsistent bugs with redis-parser lately on our application. The worst part is they seem to be happening at random, with no clear path to get it to reproduce.

We have a docker instance of cube running connected to Athena and 2 mysql databases (one for data, the other for pre-aggregations). We just launched our pre-aggregation patch, and are now pre-creating all tables before any user can access the application. We have a batch of creation running at a time, which consists of 180 queries that get requested to our cube server to trigger each pre-aggregation to be created. Each request is spaced out by 3 seconds before it attempts to check again if the table was created.

The problem is that, every now and then a request fails with the ERR Unbalanced Quotes in request, sample error:

{
    "message": "Error while querying",
    "processingId": "687",
    "queueSize": 1,
    "duration": 18373,
    "queryKey": [
        [
            "CREATE TABLE analytics_pre_aggregations.panelist_joins_by_country20200901 AS SELECT\n      \"XXXX\".country \"XXXX__country\", CASE\n    WHEN \"XXXX\".recruit_source = 1 THEN 'BigSpot'\nWHEN \"XXXX\".recruit_source = 2 THEN 'ss'\nWHEN \"XXXX\".recruit_source = 3 THEN 'snag-indeed'\nWHEN \"XXXX\".recruit_source = 4 THEN 'sh'\nWHEN \"XXXX\".recruit_source = 5 THEN 'bem'\nWHEN \"XXXX\".recruit_source = 6 THEN 'Google'\nWHEN \"XXXX\".recruit_source = 7 THEN 'Msn'\nWHEN \"XXXX\".recruit_source = 8 THEN 'TV'\nWHEN \"XXXX\".recruit_source = 9 THEN 'Facebook'\nWHEN \"XXXX\".recruit_source = 11 THEN 'Youtube'\nWHEN \"XXXX\".recruit_source = 12 THEN 'gdisplay'\nWHEN \"XXXX\".recruit_source = 13 THEN 'gsearch'\nWHEN \"XXXX\".recruit_source = 14 THEN 'fandf'\nWHEN \"XXXX\".recruit_source = 15 THEN 'Twitter'\nWHEN \"XXXX\".recruit_source = 98 THEN 'Android'\nWHEN \"XXXX\".recruit_source = 99 THEN 'iOS' ELSE 'Unknown' END \"XXXX__recruit_source_name\", \"XXXX\".country NOT IN (21, 26, 27, 28) \"panelist__pre_aggregated\", date_trunc('hour', CAST(date_add('minute', timezone_minute(\"XXXX\".joindate AT TIME ZONE 'America/La_Paz'), date_add('hour', timezone_hour(\"XXXX\".joindate AT TIME ZONE 'America/La_Paz'), \"XXXX\".joindate AT TIME ZONE 'America/La_Paz')) AS TIMESTAMP)) \"XXXX__join_date_hour\", count(\"XXXX\".id) \"XXXX__count\"\n    FROM\n      ows_production.XXXX AS \"XXXX\"  WHERE (\"XXXX\".joindate >= from_iso8601_timestamp(?) AND \"XXXX\".joindate <= from_iso8601_timestamp(?)) GROUP BY 1, 2, 3, 4",
            [
                "2020-09-01T04:00:00Z",
                "2020-10-01T03:59:59Z"
            ]
        ],
        [
            [
                {}
            ]
        ]
    ],
    "queuePrefix": "SQL_PRE_AGGREGATIONS_STANDALONE_default",
    "requestId": "2323e704-7d37-4332-b177-2f73db3cc1e6",
    "timeInQueue": 4,
    "error": "ReplyError: ERR Protocol error: unbalanced quotes in request\n    at parseError (/cube/node_modules/redis-parser/lib/parser.js:179:12)\n    at parseType (/cube/node_modules/redis-parser/lib/parser.js:302:14)"

Which then cause the query to that table to fail with the same error, returning an orchestrator error.

{
    "message": "Orchestrator error",
    "query": {
        "measures": [
            "XXXX.count"
        ],
        "dimensions": [],
        "segments": [],
        "timeDimensions": [
            {
                "dimension": "XXXX.joinDate",
                "granularity": "day",
                "dateRange": [
                    "2020-09-01T00:00:00.000",
                    "2020-09-01T23:59:59.999"
                ]
            }
        ],
        "filters": [
            {
                "dimension": "XXXX.country",
                "operator": "equals",
                "values": [
                    "2"
                ]
            }
        ],
        "timezone": "America/La_Paz"
    },
    "error": "Error: ERR Protocol error: unbalanced quotes in request",
    "duration": 1382,
    "securityContext": {},
    "requestId": "a4afaa3c-b246-4fbb-96fc-f4a6e7960b57"
}

And while this is annoying, the good thing is that the pre-agg table is still created and the next time we query for that data everything runs well. But i've already seen this same quote error happen with a regular query from our application, which causes the graph to display an error, leaving the user with no choice other than reloading the page to get it to work. Not ideal!

The second redis-related bug i've found is a little more worring, as it causes my application to completely crash and restart, which takes it offline for ~5 minutes. This error happens on a query the same way the above and it goes like:

{
    "message": "Error while querying",
    "processingId": "691",
    "queueSize": 1,
    "duration": 58061,
    "queryKey": [
        [
            "CREATE TABLE analytics_pre_aggregations.XXXXX_by_country__gender20201001 AS SELECT\n      \"XXXXX\".country \"XXXXX__country\", CASE\n    WHEN \"XXXXX\".gender = 1 THEN 'Men'\nWHEN \"XXXXX\".gender = 2 THEN 'Women' ELSE 'Unknown' END \"XXXXX__gender_names\", CASE\n    WHEN date(\"XXXXX\".joindate) = date(\"XXXXX_project\".date_start) THEN 'New' ELSE 'Old' END \"XXXXX__old_or_new\", CASE\n    WHEN \"XXXXX\".recruit_source = 1 THEN 'BigSpot'\nWHEN \"XXXXX\".recruit_source = 2 THEN 'ss'\nWHEN \"XXXXX\".recruit_source = 3 THEN 'snag-indeed'\nWHEN \"XXXXX\".recruit_source = 4 THEN 'sh'\nWHEN \"XXXXX\".recruit_source = 5 THEN 'bem'\nWHEN \"XXXXX\".recruit_source = 6 THEN 'Google'\nWHEN \"XXXXX\".recruit_source = 7 THEN 'Msn'\nWHEN \"XXXXX\".recruit_source = 8 THEN 'TV'\nWHEN \"XXXXX\".recruit_source = 9 THEN 'Facebook'\nWHEN \"XXXXX\".recruit_source = 11 THEN 'Youtube'\nWHEN \"XXXXX\".recruit_source = 12 THEN 'gdisplay'\nWHEN \"XXXXX\".recruit_source = 13 THEN 'gsearch'\nWHEN \"XXXXX\".recruit_source = 14 THEN 'fandf'\nWHEN \"XXXXX\".recruit_source = 15 THEN 'Twitter'\nWHEN \"XXXXX\".recruit_source = 98 THEN 'Android'\nWHEN \"XXXXX\".recruit_source = 99 THEN 'iOS' ELSE 'Unknown' END \"XXXXX__recruit_source_name\", \"XXXXX\".country NOT IN (21, 26, 27, 28) \"XXXXX__pre_aggregated\", date_trunc('hour', CAST(date_add('minute', timezone_minute(\"XXXXX_project\".date_start AT TIME ZONE 'America/Argentina/Buenos_Aires'), date_add('hour', timezone_hour(\"XXXXX_project\".date_start AT TIME ZONE 'America/Argentina/Buenos_Aires'), \"XXXXX_project\".date_start AT TIME ZONE 'America/Argentina/Buenos_Aires')) AS TIMESTAMP)) \"XXXXX_project__date_start_hour\", count(\"XXXXX\".id) \"XXXXX__count\"\n    FROM\n      ows_production.XXXXX AS \"XXXXX\"\nLEFT JOIN ows_production.XXXXX_project AS \"XXXXX_project\" ON \"XXXXX\".id = \"XXXXX_project\".XXXXX_id  WHERE (\"XXXXX_project\".date_start >= from_iso8601_timestamp(?) AND \"XXXXX_project\".date_start <= from_iso8601_timestamp(?)) GROUP BY 1, 2, 3, 4, 5, 6",
            [
                "2020-10-01T03:00:00Z",
                "2020-11-01T02:59:59Z"
            ]
        ],
        [
            [
                {}
            ]
        ]
    ],
    "queuePrefix": "SQL_PRE_AGGREGATIONS_STANDALONE_default",
    "requestId": "53c67623-1cd4-4f8c-88ff-47d2417bd2d7",
    "timeInQueue": 3,
    "error": "ReplyError: ERR unknown command `$52685`, with args beginning with: \n    at parseError (/cube/node_modules/redis-parser/lib/parser.js:179:12)\n    at parseType (/cube/node_modules/redis-parser/lib/parser.js:302:14)"
}

Which then shows this error

image

And proceeds to crash the application.

I have no idea what could be causing these issues, as they are not consistent. Can someone help me with that?

To Reproduce

If there's any suggestions on how i could maybe illustrate this in a better way, please let me know.

Expected behavior The queries would be processed normally as it is what happens most of the time.

Version: 0.26.94

rongfengliang commented 3 years ago

try upgrade to latest version

fabiosenracorrea commented 3 years ago

@rongfengliang yeah i saw that at least the database pool error it seems to be corrected, right? Do theses problems also get fixed there? I was waiting on all pre-agg table creations to be finished before doing that, but i guess i should try that now!

fabiosenracorrea commented 3 years ago

@rongfengliang Update: bumping to latest version did not fix the unbalanced query issue. It is still happening at random to queries.

Haven't had a chance to experience the other 2 errors yet but it's still early on.

rongfengliang commented 3 years ago

@fabiosenracorrea which kind of redis you using (cloud, open source?)

rongfengliang commented 3 years ago

@fabiosenracorrea this maybe help https://github.com/redis/redis/issues/2995

fabiosenracorrea commented 3 years ago

@rongfengliang we are using aws' redis. And that issue you mentioned just says it's probably a client issue, like i'm reporting here

paveltiunov commented 3 years ago

Hey @fabiosenracorrea ! Which driver do you use? Do you use IO Redis or an ordinary one? Could you share environment variables you provide?

fabiosenracorrea commented 3 years ago

Hey @paveltiunov! We do not use IO Redis, we use a simple ElasticCache with no TLS or password, and have been getting this 'unbalanced quotes at request' really often lately. Our version is 0.26.102 as of now and use an ENV like this:

CUBEJS_JDBC_DRIVER | athena
CUBEJS_DB_TYPE | athena
CUBEJS_DEV_MODE | false
COGNITO_REGION | cog region
COGNITO_USER_POOL_ID | cog pool id
CUBEJS_AWS_KEY | our key
CUBEJS_AWS_SECRET | our secret
CUBEJS_AWS_REGION | our region
CUBEJS_AWS_S3_OUTPUT_LOCATION | s3 bucket
CUBEJS_API_SECRET | api secret
REDIS_URL | redis://XXX-XXX.use1.cache.amazonaws.com:6379
REDIS_PASSWORD |  
REDIS_TLS | false
CUBEJS_SCHEDULED_REFRESH_TIMER | true
CUBEJS_FRONTEND | our frontend url
CUBEJS_DB_HOST | my sql host
CUBEJS_DB_PORT | 3306
CUBEJS_DB_NAME | db name.
CUBEJS_DB_USER | db user
CUBEJS_DB_PASS | db pw
CUBEJS_EXT_DB_TYPE | mysql
CUBEJS_EXT_DB_NAME | db pre-agg name
CUBEJS_EXT_DB_HOST | pre-agg host
CUBEJS_EXT_DB_PORT | 3306
CUBEJS_EXT_DB_USER | pre agg user
CUBEJS_EXT_DB_PASS | pre-aggpw
rongfengliang commented 3 years ago

try add NODE_DEBUG=redis . can view redis debug info .this maybe help these days i use this debug one redis like service bug

fabiosenracorrea commented 3 years ago

@rongfengliang still not a clue! Maybe i'll change the redis instance we are using to check if that solves it

fabiosenracorrea commented 3 years ago

An update on this: It seems that changing our redis instance solved these problems (at least for now). I'm still not quite sure why it was happening but if anyone experiences this, might as well try an instance change!