cube-js / cube

📊 Cube — The Semantic Layer for Building Data Applications
https://cube.dev

Athena Driver Assumes Success When Queries Are In Progress #8657

Closed benjaminwootton closed 1 month ago

benjaminwootton commented 2 months ago

We are using AWS Athena with the latest version of Cube.

When we use pre-aggregations, Cube appears to create a table in Athena/S3 and then select from it before exporting the data to Cube Store.

Between the CREATE and the SELECT, Cube asks Athena whether the table is ready. If the state is not FAILED or CANCELLED, it assumes the query has SUCCEEDED and proceeds to the SELECT.

Unfortunately, the CREATE TABLE query can also be in the QUEUED or RUNNING state in Athena. The subsequent SELECT then fails with the error below:

Error: Internal: Error during planning: Table or CTE with name 'dev_pre_aggregations.hub_agent_home_agent_abandoned_contacts_metric_main_r13zdynn_f4to0zcn_1jd0v24' not found

The affected code in Cube:

  protected async checkStatus(qid: AthenaQueryId): Promise<boolean> {
    const queryExecution = await this.athena.getQueryExecution(qid);
    const status = queryExecution.QueryExecution?.Status?.State;
    if (status === 'FAILED') {
      throw new Error(queryExecution.QueryExecution?.Status?.StateChangeReason);
    }
    if (status === 'CANCELLED') {
      throw new Error('Query has been cancelled');
    }
    return status === 'SUCCEEDED';
  }

Expected behavior

The pre-aggregation logic should continue to poll while a long-running creation statement is in the QUEUED or RUNNING state.
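
To make the expected behavior concrete, here is a minimal sketch of how each Athena query state could map to a polling decision. The `decide` function and the `PollDecision` type are hypothetical names used only for illustration and are not part of the Cube driver. The five state names are the ones Athena reports for a query execution.

```typescript
// The states Athena reports for a query execution.
type AthenaState = 'QUEUED' | 'RUNNING' | 'SUCCEEDED' | 'FAILED' | 'CANCELLED';

// Hypothetical result type: either stop polling or check again later.
type PollDecision = 'done' | 'keep-polling';

// Hypothetical helper (not the driver's code): maps a state to a decision.
function decide(state: AthenaState | undefined, reason?: string): PollDecision {
  switch (state) {
    case 'SUCCEEDED':
      return 'done';
    case 'FAILED':
      throw new Error(reason ?? 'Query failed');
    case 'CANCELLED':
      throw new Error('Query has been cancelled');
    case 'QUEUED':
    case 'RUNNING':
    default:
      // In-progress or missing state: the table is not ready yet.
      return 'keep-polling';
  }
}
```

Under this mapping, only SUCCEEDED allows the subsequent SELECT to run; QUEUED and RUNNING always lead to another status check.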

benjaminwootton commented 2 months ago

Looking at the code, I may have a workaround by setting CUBEJS_DB_POLL_TIMEOUT, but I worry that it will affect query times across the board and not just this interaction with Athena.

github-actions[bot] commented 2 months ago

If you are interested in working on this issue, please go ahead and provide a PR for it. We'd be happy to review it and merge it. If this is the first time you are contributing a Pull Request to Cube, please check our contribution guidelines. You can also post any questions while contributing in the #contributors channel in the Cube Slack.

igorlukanin commented 2 months ago

Thanks for a great report @benjaminwootton!

paveltiunov commented 2 months ago

@benjaminwootton AFAIK this is exactly what happens in this place. If the query is still running, checkStatus returns false and polling continues. Could you please provide more info about this? It'd be ideal if you could reproduce it in Cube Cloud.
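
The polling behavior described in this comment can be sketched roughly as follows. This is not the driver's actual loop: `waitForSuccess`, `CheckStatus`, and the interval and timeout parameters are assumptions made for illustration. The only property taken from the thread is that the check resolves to `false` while the query is still in progress and `true` once it has succeeded.

```typescript
// Hypothetical stand-in for the driver's checkStatus:
// resolves true once the query has succeeded, false while it is in progress.
type CheckStatus = () => Promise<boolean>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical polling loop: keeps calling `check` until it returns true
// or the timeout elapses. Returns the number of checks performed.
async function waitForSuccess(
  check: CheckStatus,
  intervalMs: number,
  timeoutMs: number,
): Promise<number> {
  const deadline = Date.now() + timeoutMs;
  let attempts = 0;
  while (true) {
    attempts += 1;
    if (await check()) {
      return attempts; // Query succeeded; the caller may now SELECT.
    }
    if (Date.now() >= deadline) {
      throw new Error(`Query did not finish within ${timeoutMs} ms`);
    }
    await sleep(intervalMs); // A false result means "not ready yet".
  }
}
```

With a loop of this shape, a `false` result from the status check never triggers the SELECT; it only delays the next check, and an error thrown by the check (FAILED or CANCELLED) propagates to the caller.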

igorlukanin commented 1 month ago

@benjaminwootton Sorry, have you been able to reproduce this in Cube Cloud? Alternatively, if you can help reproduce this in Cube Core, that would be wonderful.

igorlukanin commented 1 month ago

Closing this due to inactivity. @benjaminwootton Please feel free to reopen!