cube-js / cube

📊 Cube — Universal semantic layer platform for AI, BI, spreadsheets, and embedded analytics
https://cube.dev

Pre-Aggregations creates BigQuery table by default on US region #257

Open konmavrakis opened 5 years ago

konmavrakis commented 5 years ago

Describe the bug: I have a BigQuery dataset located in the EU. When I create a Pre-Aggregation, its table is created in the US region, so when the Pre-Aggregation query runs, the main dataset can't be found.

To Reproduce: Steps to reproduce the behavior:

  1. Have an EU dataset
  2. Create a simple Pre-Aggregation
  3. Build with the Pre-Aggregation as a measure
  4. See the error "Dataset was not found in location US"

Expected behavior: When a Pre-Aggregation is created, its table should be created in the same region as the provided dataset.

Version: 0.11.16

Additional context: I deleted the table that was created in the US and recreated it with the same name in the EU, and the queries worked.

paveltiunov commented 5 years ago

@konmavrakis Hey Ntinos! Thanks for posting this one! You can pass location to BigQueryDriver directly like:

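const CubejsServerCore = require('@cubejs-backend/server-core');
const BigQueryDriver = require('@cubejs-backend/bigquery-driver');

CubejsServerCore.create({
  // Set location to the region of your BigQuery dataset
  driverFactory: () => new BigQueryDriver({ location: 'EU' })
});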

More info on how to use driverFactory here: https://cube.dev/docs/@cubejs-backend-server-core#options-reference-driver-factory

I think it makes sense to support some environment variable for it as well here: https://github.com/cube-js/cube.js/blob/master/packages/cubejs-bigquery-driver/driver/BigQueryDriver.js#L22. Let's use this issue to track this implementation.
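Until such a variable is supported by the driver itself, one way to get the same effect is to read it in your own code and pass the result to driverFactory. This is only a rough sketch: the variable name CUBEJS_DB_BQ_LOCATION and the helper bigQueryOptionsFromEnv are assumptions made up for illustration, not something this thread confirms.

```javascript
// Assumed variable name, chosen for illustration only.
const LOCATION_ENV_VAR = 'CUBEJS_DB_BQ_LOCATION';

// Hypothetical helper: build the options object for BigQueryDriver
// from an env-like object (defaults to process.env).
function bigQueryOptionsFromEnv(env = process.env) {
  const location = (env[LOCATION_ENV_VAR] || '').trim();
  // Only set `location` when the variable is present and non-empty,
  // so the driver's default behaviour is untouched otherwise.
  return location ? { location } : {};
}

// Usage with the driverFactory option shown above
// (requires the Cube.js packages to be installed):
// CubejsServerCore.create({
//   driverFactory: () => new BigQueryDriver(bigQueryOptionsFromEnv())
// });
```

Keeping the helper free of any driver import means it can be checked on its own, e.g. bigQueryOptionsFromEnv({ CUBEJS_DB_BQ_LOCATION: 'EU' }) yields { location: 'EU' }.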

konmavrakis commented 5 years ago

Hey @paveltiunov, thanks for your response! I’ve checked the docs but never found this option! May I suggest that the location be retrieved from the main table, so that it stays in scope with the project? I’m suggesting this because a project might have datasets in different locations, so manually setting a single location might be a problem.

paveltiunov commented 5 years ago

@konmavrakis Unfortunately there's no hook for checking the dataset region before querying. In this case you can go with a multitenancy setup: https://cube.dev/docs/multitenancy-setup#multiple-db-instances-with-same-schema. With this approach you'd need to maintain the list of tables and their locations manually.
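A rough sketch of what that could look like, under several assumptions: the request context is assumed to carry authInfo.dataset, the dataset names and the helpers datasetFromContext and makeDriverFactory are hypothetical, and the location map is the hand-maintained list mentioned above. The driver class is injected so the sketch runs without the BigQuery package installed.

```javascript
// Hand-maintained map from dataset name to BigQuery location (example values).
const DATASET_LOCATIONS = {
  analytics_eu: 'EU',
  analytics_us: 'US',
};

// Assumption: each request's context carries authInfo.dataset.
function datasetFromContext(context) {
  const dataset = context && context.authInfo && context.authInfo.dataset;
  if (!dataset || !Object.prototype.hasOwnProperty.call(DATASET_LOCATIONS, dataset)) {
    throw new Error(`No location configured for dataset: ${dataset}`);
  }
  return dataset;
}

// Assumption: a distinct app id per dataset keeps per-dataset instances separate.
function contextToAppId(context) {
  return `CUBEJS_APP_${datasetFromContext(context)}`;
}

// Returns a driverFactory that creates a driver with the location
// configured for the dataset named in the context.
function makeDriverFactory(DriverClass) {
  return (context) =>
    new DriverClass({ location: DATASET_LOCATIONS[datasetFromContext(context)] });
}

// Usage (requires the Cube.js packages to be installed):
// CubejsServerCore.create({
//   contextToAppId,
//   driverFactory: makeDriverFactory(BigQueryDriver)
// });
```

Throwing on an unknown dataset is a deliberate choice here: falling back to a default location would reproduce the "Dataset was not found" error from the original report, only later and less obviously.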