cube-js / cube

📊 Cube — The Semantic Layer for Building Data Applications
https://cube.dev
Other
17.8k stars 1.76k forks source link

Loading database schema from Snowflake fails in some cases #7440

Open brettwatkins-Built opened 10 months ago

brettwatkins-Built commented 10 months ago

Describe the bug After successfully authenticating to Snowflake and running the initial information schema query, Cube returns 'Undefined' for our tables in the Data Model section. I can see initial information schema query successful runs, but I never get the results back into Cube.

To Reproduce Steps to reproduce the behavior:

  1. Connect to a Snowflake instance
  2. Wait for initial response
  3. See error in Data Model

Expected behavior I would expect to see our Snowflake tables

Screenshots

Screenshot 2023-11-20 at 3 13 33 PM

Version: v0.34.23

zbakkernomi commented 3 months ago

having the same issue

snowflake version: 8.21.1 cube version: v0.35.45

christianallred-nomi commented 1 month ago

Do we have a solution or an update on this? its been almost 10 months now and no resposne?

igorlukanin commented 1 month ago

Hi @christianallred-nomi 👋 Are you experiencing the same issue? Which Snowflake and Cube versions do you use?

christianallred-nomi commented 1 month ago

@igorlukanin yes i ran into the same issue.

I was using the off the shelf playground, assuming every tag of cubjs/cube gets pushed to latest it would have been cube: cubejs/cube:latest (0.35.70) <- assuming this. snowflake: 8.31.1

from the cube container

{
  "name": "@cubejs-backend/docker",
  "version": "0.35.66",
  "description": "Cube.js In Docker (virtual package)",
  "author": "Cube Dev, Inc.",
  "license": "Apache-2.0",
  "private": true,
  "engines": {
    "node": "^14.0.0 || ^16.0.0 || >=17.0.0"
  },
  "dependencies": {
    "@cubejs-backend/athena-driver": "^0.35.66",
    "@cubejs-backend/bigquery-driver": "^0.35.66",
    "@cubejs-backend/clickhouse-driver": "^0.35.66",
    "@cubejs-backend/crate-driver": "^0.35.66",
    "@cubejs-backend/databricks-jdbc-driver": "^0.35.66",
    "@cubejs-backend/dbt-schema-extension": "^0.35.66",
    "@cubejs-backend/dremio-driver": "^0.35.66",
    "@cubejs-backend/druid-driver": "^0.35.66",
    "@cubejs-backend/duckdb-driver": "^0.35.66",
    "@cubejs-backend/elasticsearch-driver": "^0.35.64",
    "@cubejs-backend/firebolt-driver": "^0.35.66",
    "@cubejs-backend/hive-driver": "^0.35.63",
    "@cubejs-backend/ksql-driver": "^0.35.66",
    "@cubejs-backend/materialize-driver": "^0.35.66",
    "@cubejs-backend/mongobi-driver": "^0.35.64",
    "@cubejs-backend/mssql-driver": "^0.35.63",
    "@cubejs-backend/mysql-driver": "^0.35.66",
    "@cubejs-backend/oracle-driver": "^0.35.63",
    "@cubejs-backend/postgres-driver": "^0.35.66",
    "@cubejs-backend/prestodb-driver": "^0.35.64",
    "@cubejs-backend/questdb-driver": "^0.35.66",
    "@cubejs-backend/redshift-driver": "^0.35.66",
    "@cubejs-backend/server": "^0.35.66",
    "@cubejs-backend/snowflake-driver": "^0.35.63",
    "@cubejs-backend/sqlite-driver": "^0.35.63",
    "@cubejs-backend/trino-driver": "^0.35.66",
    "cubejs-cli": "^0.35.66",
    "typescript": "~5.2.2"
  },
  "resolutions": {
    "colors": "1.4.0"
  }
}
christianallred-nomi commented 1 month ago

it looks like you have some type of mapping problem. it looks to be getting data from something but the processing appears to fail? im going to dig in more in a bit

Screenshot 2024-08-21 at 1 51 45 PM

christianallred-nomi commented 1 month ago

i found it.

this code: https://github.com/cube-js/cube/blob/861f13e4265d95015a2d34ebeccd77a3b61fd1c8/packages/cubejs-snowflake-driver/src/SnowflakeDriver.ts#L820-L829

doesn't do what you're hoping, those as "column_name" and such dont actually lowercase the response objects. snowflake headers are coerced the capitols.

this is the response rows that come out of the this.query() in baseDriver

{
  COLUMN_NAME: 'OFFICE_CBSA',
  TABLE_NAME: 'RX_CLAIM_DENORMALIZED',
  TABLE_SCHEMA: 'ARTEMIS_CHILD',
  DATA_TYPE: 'TEXT'
}

so when you got to reduce them in informationColumnsSchemaReducer

and use these let schema = (result[i.table_schema] || {}); it fails

christianallred-nomi commented 1 month ago

This code change confirmed it. Not sure if this is an option of the snowflake api or not, but its this code that is failing and returning the undefined's

Screenshot 2024-08-22 at 11 25 27 AM

I've also confirmed this works in the playground but can't share that screenshot because i was testing on our dataset.

christianallred-nomi commented 1 month ago

i think this was a change that snowflake made recently(ish)

https://docs.snowflake.com/en/release-notes/bcr-bundles/2023_07/bcr-881

it looks like snowflake wants you to set a value to control this in your instance. As i dont think the fix for this should be imposing snowflake settings. probably the right route is to override the informationColumnsSchemaReducer inside of the snowflake driver?

Screenshot 2024-08-22 at 11 42 08 AM

looks like the sortBykeys is a private method inside of the base driver though. not sure you'd want to extract that to a util or just copy paste, given two different packages, probably just copy paste.

this does introduce a coupled dependency though.

could update the base driver func to have an option to use captial identifiers, but that feels gross as well.

I'll let yall decide on your patch, but theres your error.

christianallred-nomi commented 1 month ago

mgiht be able to set this CLIENT_RESULT_COLUMN_CASE_INSENSITIVE on the session possibly.

github-actions[bot] commented 1 month ago

If you are interested in working on this issue, please go ahead and provide PR for that. We'd be happy to review it and merge it. If this is the first time you are contributing a Pull Request to Cube, please check our contribution guidelines. You can also post any questions while contributing in the #contributors channel in the Cube Slack.

igorlukanin commented 1 month ago

Wow! Thanks for digging so deep into this @christianallred-nomi 👋

Looks like you have nailed it. Do you have any sense who is exposed to this? I've tried to reproduce this with my Snowflake instance and data, it works smoothly with the latest Cube Core.

cyl-e commented 1 month ago

Hi, thank you for the detailed analysis, it looks like we ran into the same issue.

My five cents: To me it appears like those users are affected who have set the Snowflake account parameter QUOTED_IDENTIFIERS_IGNORE_CASE = True for reasons (among others) documented here.

christianallred commented 1 month ago

Hi, thank you for the detailed analysis, it looks like we ran into the same issue.

My five cents: To me it appears like those users are affected who have set the Snowflake account parameter QUOTED_IDENTIFIERS_IGNORE_CASE = True for reasons (among others) documented here.

This is what I identified as well.

However I don’t think that this semantic api should have an opinion on sb infrastructure this should be abstracted out In the connector imp.

It is likely that your sf instance has this value set differently.

christianallred commented 1 month ago

I had a PR to fix this locally but didn’t love the solution. I can look at this again next week.

christianallred-nomi commented 3 weeks ago

I've opened a PR for this here: https://github.com/cube-js/cube/pull/8665