cube-js / cube

📊 Cube — The Semantic Layer for Building Data Applications
https://cube.dev
Other
17.71k stars 1.75k forks source link

AWS Athena v3: "Casting a Timestamp with Time Zone to Timestamp is not supported" #5849

Open benhall-7 opened 1 year ago

benhall-7 commented 1 year ago

Describe the bug

AWS Athena v3 timestamps with TZ do not support casting into normal timestamps, resulting in an error if you run a query that uses granularity within a time dimension. The exact error message is in the title above.

To Reproduce

Steps to reproduce the behavior:

  1. Use a cube with athena as the data source and at least one time dimension present
  2. Run a query on this cube, with a time dimension + granularity

Expected behavior

The CubeJS produces compatible SQL for Athena v3 by avoiding this cast, and fetches the data.

Minimally reproducible Cube Schema

cube(`Orders`, {
  sql: `
  select 1 as id, 100 as amount, 'new' status, '2022-12-13' day
  UNION ALL
  select 2 as id, 200 as amount, 'new' status, '2022-12-14' day
  UNION ALL
  select 3 as id, 300 as amount, 'processed' status, '2022-12-15' day
  UNION ALL
  select 4 as id, 500 as amount, 'processed' status, '2022-12-16' day
  UNION ALL
  select 5 as id, 600 as amount, 'shipped' status, '2022-12-17' day
  `,
  measures: {
    count: {
      type: `count`,
    },
    totalAmount: {
      sql: `amount`,
      type: `sum`,
    },
    toRemove: {
      type: `count`,
    },
  },
  dimensions: {
    status: {
      sql: `status`,
      type: `string`,
    },
    time: {
      sql: `from_iso8601_timestamp(day)`,
      type: `time`,
  },
});

Version:

CubeJS Athena driver v^0.29.21

Athena v3. The version of Athena itself is specified in AWS by default, and is able to be changed in the workgroup settings. See https://docs.aws.amazon.com/athena/latest/ug/engine-versions-changing.html

Additional context

We encountered this issue accidentally, when an engineer changed the version of Athena to version 3 in AWS. Prior to this change, it was not directly specified which version was used, which meant it was possible that it could switch with no warning. Athena v3 uses the Trino and Presto projects internally, but oddly, even though both of those projects support such casts, Athena v3 does not. Amazon published a changelog which specifically points out this limitation in the Breaking Changes section: https://docs.aws.amazon.com/athena/latest/ug/engine-versions-reference-0003.html.

The cast which actually raises the error should be found right here: https://github.com/cube-js/cube.js/blob/a8907df035cf53aeab3775990bd9e3ae891380e3/packages/cubejs-schema-compiler/src/adapter/PrestodbQuery.js#L51-L56

I can't verify this on my own, but it was suggested to me by another coworker that all that is necessary to fix this is remove the CAST(X AS TIMESTAMP) in the code above.

github-actions[bot] commented 1 year ago

If you are interested in working on this issue, please leave a comment below and we will be happy to assign the issue to you. If this is the first time you are contributing a Pull Request to Cube.js, please check our contribution guidelines. You can also post any questions while contributing in the #contributors channel in the Cube.js Slack.