cube-js / cube

📊 Cube — The Semantic Layer for Building Data Applications
https://cube.dev
Other
17.94k stars 1.78k forks source link

SQL API doesn't work with dates like '1900-01-01 00:00:00 UTC' #8057

Closed A-Lasso closed 2 months ago

A-Lasso commented 7 months ago

Describe the bug

Seems to be related to date value columns, not sure if I can describe it but I have many examples.

All the errors encountered:

Errors with logs and screenshots

First error (FIXED -temporarily- but want to show this, might help) The error:

ERROR: Unexpected panic. Reason: called Option::unwrap() on a None value

the same in this four tables:

If we do a preview we can see the data in there:

6887BBD5A70E4E4CAABF93C63820DFB2 9A78B02780694EDEB2BD157EBC914102 1A354E44819C40C4AB8B323898F9B3C3 181F5FA2F6E742FDA1DC392247DB4979

If we try to join is going to fail: Preview: BF267CA615CC409AB6C10D43BCA9A70D Question: DDC64065142F4BC8A6D154143256EEED

If we deselect this columns we can see the data in the tables:

We are able to run the table in the playground:

image (29)

Solution (for this four tables case): The columns we needed to unselect for the table to work were all date type, so we studied our data and we have some weird dates like 0020, year 20, it can be a bad input of our clients but the thing is I don't think any date should break the table like this we filtered the LocationTags column and that fixed all tables (the other three are dynamic tables created with LocationTags values, so the date values are filtered for all four tables).

image

We filtered the dates to be '1900-01-01 00:00:00 UTC' and after (timestamp from BigQuery, the tables that cube uses are in BigQuery), that fixed all the tables.

See the looms I did for Metabase team:

This same Error showed again in another question, and here is where the other errors start showing when I start doing some research/testing. That is why this videos are from the question in the table IT_Bank_Reconciliation, that question showed same error.

I recreated the errors (in that same question of IT_Bank_Reconciliation) so I can share with you the cube logs:

(visualize) image (7)

Error during processing PostgreSQL message: CubeError: Unexpected panic. Reason: called Option::unwrap() on a None value apiType sql appName Metabase v1.48.7 duration 2565 protocol postgres query { "sql": "-- Metabase:: userID: 1 queryType: MBQL queryHash: 5906fd5f62dcb1aa1a284ef505bb5f8f5d28949189c95d251fda75c998ca44a5\nSELECT \"public\".\"IT_Bank_Reconciliation\".\"Account_Manager_Verified\" AS \"Account_Manager_Verified\", \"public\".\"IT_Bank_Reconciliation\".\"Date_Completed\" AS \"Date_Completed\", \"public\".\"IT_Bank_Reconciliation\".\"Journal_Entry_Notes\" AS \"Journal_Entry_Notes\", \"public\".\"IT_Bank_Reconciliation\".\"Notes\" AS \"Notes\", \"public\".\"IT_Bank_Reconciliation\".\"Number_of_Outstanding_Checks\" AS \"Number_of_Outstanding_Checks\", \"public\".\"IT_Bank_Reconciliation\".\"Number_of_Outstanding_Deposits\" AS \"Number_of_Outstanding_Deposits\", \"public\".\"IT_Bank_Reconciliation\".\"Uploaded_Bank_Statement\" AS \"Uploaded_Bank_Statement\", \"InteractionLocations - cubeJoinField\".\"url\" AS \"InteractionLocations - cubeJoinFieldurl\", \"MGMLocations - cubeJoinField\".\"name\" AS \"MGMLocations - cubeJoinFieldname\", \"MGMLocations - cubeJoinField\".\"Account_Manager\" AS \"MGMLocations - cubeJoinFieldAccount_Manager\", \"MGMLocations - cubeJoinField\".\"Active_Status\" AS \"MGMLocations - cubeJoinFieldActive_Status\", \"MGMLocations - cubeJoinField\".\"HOA_Type\" AS \"MGMLocations - cubeJoinFieldHOA_Type\", \"MGMLocations - cubeJoinField\".\"Plan_Type\" AS \"MGMLocations - cubeJoinFieldPlan_Type\" FROM \"public\".\"IT_Bank_Reconciliation\" LEFT JOIN \"public\".\"InteractionLocations\" AS \"InteractionLocations - cubeJoinField\" ON \"public\".\"IT_Bank_Reconciliation\".\"cubeJoinField\" = \"InteractionLocations - cubeJoinField\".\"cubeJoinField\" LEFT JOIN \"public\".\"MGMLocations\" AS \"MGMLocations - cubeJoinField\" ON \"InteractionLocations - cubeJoinField\".\"cubeJoinField\" = \"MGMLocations - cubeJoinField\".\"__cubeJoinField\" WHERE \"public\".\"IT_Bank_Reconciliation\".\"Date_Completed\" IS NOT NULL LIMIT 2000" }

Second cube logs for the same error: image (9) Error during processing PostgreSQL message: CubeError: Unexpected panic. Reason: called Option::unwrap() on a None value

Error during processing PostgreSQL message: CubeError: Arrow error: External error: Internal error: Empty iterator passed to ScalarValue::iter_to_array. This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker apiType sql appName Metabase v1.48.7 duration 2470 protocol postgres query { "sql": "-- Metabase:: userID: 1 queryType: MBQL queryHash: 40afdf76c37504a078858cb0fb60c64c14e52cd52a22d1c0d378fa32450f5535\nSELECT CAST(\"public\".\"IT_Bank_Reconciliation\".\"Date_Completed\" AS date) AS \"Date_Completed\", count(distinct \"public\".\"IT_Bank_Reconciliation\".\"Date_Completed\") AS \"count\" FROM \"public\".\"IT_Bank_Reconciliation\" LEFT JOIN \"public\".\"InteractionLocations\" AS \"InteractionLocations - cubeJoinField\" ON \"public\".\"IT_Bank_Reconciliation\".\"cubeJoinField\" = \"InteractionLocations - cubeJoinField\".\"cubeJoinField\" LEFT JOIN \"public\".\"MGMLocations\" AS \"MGMLocations - cubeJoinField\" ON \"InteractionLocations - cubeJoinField\".\"cubeJoinField\" = \"MGMLocations - cubeJoinField\".\"__cubeJoinField\" GROUP BY CAST(\"public\".\"IT_Bank_Reconciliation\".\"Date_Completed\" AS date) ORDER BY CAST(\"public\".\"IT_Bank_Reconciliation\".\"Date_Completed\" AS date) ASC LIMIT 10" }

Second cube logs for the same error: image (12)

Error during processing PostgreSQL message: CubeError: Arrow error: External error: Internal error: Empty iterator passed to ScalarValue::iter_to_array. This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker

image (17)

image (18)

image (19)

image (20)

I was thinking that maybe it was because I can't group by a column without a measure or another column in the group by but this example works: image (13) image (14) image (15)

What I understand with this testing is: the problem is in date columns in this case.

TypeError TypeError: Cannot read properties of undefined (reading 'split') at CloudApiGateway.queryRewrite (/cube/conf/cube.js:43:47) at /cube/node_modules/@cubejs-backend/api-gateway/src/gateway.ts:1136:38 at Array.map () at CloudApiGateway.getNormalizedQueries (/cube/node_modules/@cubejs-backend/api-gateway/src/gateway.ts:1133:15) at CloudApiGateway.sqlApiLoad (/cube/node_modules/@cubejs-backend/api-gateway/src/gateway.ts:1701:20) at /cube/node_modules/@cubejs-backend/api-gateway/src/sql-server.ts:132:13 apiType sql appName Metabase v1.48.7 duration 213 protocol postgres query { "measures": [], "dimensions": [], "segments": [], "timeDimensions": [ { "dimension": "IT_Bank_Reconciliation.Date_Completed", "granularity": "day" } ], "limit": 50000 }

image (22)

Error during processing PostgreSQL message: CubeError: Arrow error: Compute error: TypeError: Cannot read properties of undefined (reading 'split') apiType sql appName Metabase v1.48.7 duration 588 protocol postgres query { "sql": "-- Metabase:: userID: 1 queryType: MBQL queryHash: 84930f23b3174bcd1d90bb8cc511c6e333ecf9b119eb03de5d4004923d0b115b\nSELECT CAST(\"public\".\"IT_Bank_Reconciliation\".\"Date_Completed\" AS date) AS \"Date_Completed\" FROM \"public\".\"IT_Bank_Reconciliation\" LEFT JOIN \"public\".\"InteractionLocations\" AS \"InteractionLocations - cubeJoinField\" ON \"public\".\"IT_Bank_Reconciliation\".\"cubeJoinField\" = \"InteractionLocations - cubeJoinField\".\"cubeJoinField\" LEFT JOIN \"public\".\"MGMLocations\" AS \"MGMLocations - cubeJoinField\" ON \"InteractionLocations - cubeJoinField\".\"cubeJoinField\" = \"MGMLocations - cubeJoinField\".\"__cubeJoinField\" GROUP BY CAST(\"public\".\"IT_Bank_Reconciliation\".\"Date_Completed\" AS date) ORDER BY CAST(\"public\".\"IT_Bank_Reconciliation\".\"Date_Completed\" AS date) ASC LIMIT 2000" }

image (23)

Error during processing PostgreSQL message: CubeError: Arrow error: Compute error: TypeError: Cannot read properties of undefined (reading 'split')

I was also working in some clients dashboards and something else happened, and even the error tells me that this is from cube side:

image (24)

ERROR: Error during rewrite: Can't detect Cube query and it may be not supported yet. Please check logs for additional information. QUERY: SELECT CAST("InteractionLocations - cubeJoinField"."createdAt" AS DATE) AS "InteractionLocations - cubeJoinFieldcreatedAt", SUM("public"."IT_Daily_IPS_CheckIn"."Denied") AS "sum", SUM("public"."IT_Daily_IPS_CheckIn"."Pending_Approval") AS "sum_2", SUM("public"."IT_Daily_IPS_CheckIn"."Emailed_for_Approval") AS "sum_3", SUM("public"."IT_Daily_IPS_CheckIn"."Duplicates") AS "sum_4", SUM("public"."IT_Daily_IPS_CheckIn"."Failed_Posts") AS "sum_5" FROM "public"."IT_Daily_IPS_CheckIn" LEFT JOIN "public"."InteractionLocations" AS "InteractionLocations - cubeJoinField" ON "public"."IT_Daily_IPS_CheckIn"."cubeJoinField" = "InteractionLocations - cubeJoinField"."cubeJoinField" LEFT JOIN "public"."MGMLocations" AS "MGMLocations - cubeJoinField" ON "InteractionLocations - cubeJoinField"."cubeJoinField" = "MGMLocations - cubeJoinField"."cubeJoinField" WHERE "public"."IT_Daily_IPS_CheckIn"."Questions" IS NOT NULL GROUP BY CAST("InteractionLocations - cubeJoinField"."createdAt" AS DATE) ORDER BY CAST("InteractionLocations - cubeJoinField"."createdAt" AS DATE) ASC LIMIT 10

image (25)

ERROR: Error during rewrite: Can't detect Cube query and it may be not supported yet. Please check logs for additional information. QUERY: SELECT CAST("InteractionLocations - cubeJoinField"."createdAt" AS DATE) AS "InteractionLocations - cubeJoinFieldcreatedAt", SUM("public"."IT_Daily_IPS_CheckIn"."Denied") AS "sum" FROM "public"."IT_Daily_IPS_CheckIn" LEFT JOIN "public"."InteractionLocations" AS "InteractionLocations - cubeJoinField" ON "public"."IT_Daily_IPS_CheckIn"."cubeJoinField" = "InteractionLocations - cubeJoinField"."cubeJoinField" LEFT JOIN "public"."MGMLocations" AS "MGMLocations - cubeJoinField" ON "InteractionLocations - cubeJoinField"."cubeJoinField" = "MGMLocations - cubeJoinField"."cubeJoinField" WHERE "public"."IT_Daily_IPS_CheckIn"."Questions" IS NOT NULL GROUP BY CAST("InteractionLocations - cubeJoinField"."createdAt" AS DATE) ORDER BY CAST("InteractionLocations - cubeJoinField"."createdAt" AS DATE) ASC LIMIT 10

image (26)

ERROR: Arrow error: Compute error: TypeError: Cannot read properties of undefined (reading 'split') Same error I showed above in this same email, and again this is a date column.

image (27) (bin or unnbinned the error is still there)

The problem seems to be in the summarize/date column: image (28)

Version:

Cube Cloud: stable (0.34.52) and Latest (0.35.5) -tried to see if it was fixed-

Metabase Cloud: 1.48.7

Additional context

I already tested with the newer version (it didn't work) but I can't be changing the stable version, it already broke our production two different times, and we couldn't get back our metabase local production on track on the second time, so now we are on metabase cloud and re-did everything we could for our clients. Hope you can understand that it is not so easy to test, because any change in cube would mean risk for this to happen again.

Don't know if this has to do but all tables with IT and LT in the beginning are dynamic tables.

Really sorry if this too long but wanted to give you guys all the context I could and all the data/examples I found so this is faster or at least has more context. Let me know if there are any more questions or something you want me to try, I think I shared all I could. Thank you for your help and taking the time to read all this text and issues, hope we can figure out what is happening!

igorlukanin commented 7 months ago

@A-Lasso Thanks for a very elaborate report! I've renamed the issue—do you think I've got the main idea correctly?

A-Lasso commented 7 months ago

Yes I think so, but I'm not sure all of them have those weird dates, so I will see in playground what I can find in this other tables that cannot be filtered by date columns.

Also I was said to add the "CUBESQL_SQL_PUSH_DOWN" env variable and set it to true, but this just gave me more errors and broke even more the summarizes and tables, so I deleted it again to at least have the dashboards that were done not broken.

As you see, the IT_Bank_Reconciliation table doesn't have any data before that year: image

image

So seems there is something else happening, if there is anything you want me to test I'll be happy to.

Also since it mentions the "split" in some errors this is what I could find with the "split" in the cube.js file.

const queryRewrite = (query, request) => {
    console.log("Rewriting for", JSON.stringify(request))
    const accountId = request.cacct !== undefined ? request.cacct : request.securityContext.cacct;

    if (query["dimensions"].length > 0) {
        var table_name = query["dimensions"][0].split(".")[0]
    }
    // measures in query
    else {
        var table_name = query["measures"][0].split(".")[0]
    }

    console.log("Using account id:", accountId);

    query.filters.push({
        member: `${table_name}.accountUuid`,
        operator: 'equals',
        values: [accountId],
    });

    return query;
}
A-Lasso commented 7 months ago

Hello I'm here to share more examples of the problem.

image image

we cannot do a summarize having a not null or not empty filtering of mostly ANY column (doesn't matter if its date or not):

the visualization without a summarize: image

with a summarize: image image image

A-Lasso commented 7 months ago

Since there is still no update, I will continue to send some examples, not sure if they are repeated or not, but as I show in an older message, this doesn't look like just a date value problem.

image

I added the env variable as this logs said and this is what I got re-doing the question on metabase:

image image image

Now I changed it to "Full" instead of 1 and this is what I got: image image image

other logs (SQL Query and normal cube error log):

image

query: { "sql": "-- Metabase:: userID: 1 queryType: MBQL queryHash: 36a8c22444227518ebbf37738bfe21864c71dbe004c10a771ca5ceb64afb87e2\nSELECT \"public\".\"IT_Bank_Reconciliation\".\"interactionUuid\" AS \"interactionUuid\", \"public\".\"IT_Bank_Reconciliation\".\"accountUuid\" AS \"accountUuid\", \"public\".\"IT_Bank_Reconciliation\".\"Account_Manager_Verified\" AS \"Account_Manager_Verified\", \"public\".\"IT_Bank_Reconciliation\".\"All_Checks_VoidedReissued\" AS \"All_Checks_VoidedReissued\", \"public\".\"IT_Bank_Reconciliation\".\"Date_Completed\" AS \"Date_Completed\", \"public\".\"IT_Bank_Reconciliation\".\"Journal_Entry_Notes\" AS \"Journal_Entry_Notes\", \"public\".\"IT_Bank_Reconciliation\".\"Notes\" AS \"Notes\", \"public\".\"IT_Bank_Reconciliation\".\"Number_of_Outstanding_Checks\" AS \"Number_of_Outstanding_Checks\", \"public\".\"IT_Bank_Reconciliation\".\"Number_of_Outstanding_Deposits\" AS \"Number_of_Outstanding_Deposits\", \"public\".\"IT_Bank_Reconciliation\".\"TalkedConfirmed_With_Vendors\" AS \"TalkedConfirmed_With_Vendors\", \"public\".\"IT_Bank_Reconciliation\".\"Uploaded_Bank_Statement\" AS \"Uploaded_Bank_Statement\", \"public\".\"IT_Bank_Reconciliation\".\"user\" AS \"user\", \"public\".\"IT_Bank_Reconciliation\".\"cubeJoinField\" AS \"cubeJoinField\" FROM \"public\".\"IT_Bank_Reconciliation\" WHERE \"public\".\"IT_Bank_Reconciliation\".\"Account_Manager_Verified\" IS NOT NULL LIMIT 2000" }

image

Metabase question generating this error: image image image image

Thank you again for the help.

igorlukanin commented 7 months ago

@A-Lasso Thanks for the details.

Also I was said to add the "CUBESQL_SQL_PUSH_DOWN" env variable and set it to true, but this just gave me more errors and broke even more the summarizes and tables, so I deleted it again to at least have the dashboards that were done not broken.

I would recommend to upgrade to the latest version and give this option a try again. This is going to be the default mode for the SQL API in the coming weeks/months, and I feel like making it work with Metabase is the right approach going forward.

A-Lasso commented 7 months ago

Hello again, seems like we had code errors from our side, so I think the issue is primarily what the title says.

Sorry for all the bother and thank you so much for the help.

HaidarZ commented 2 months ago

Please test on the latest version cubejs/cube:v0.35.78 , it seems that the problem is fixed on this specific release.

HaidarZ commented 2 months ago

https://github.com/cube-js/cube/pull/8583

igorlukanin commented 2 months ago

Closing this one as resolved—please feel free to reopen if this is still an issue.