cube-js / cube

📊 Cube — The Semantic Layer for Building Data Applications
https://cube.dev
Other
17.95k stars 1.78k forks source link

Cannot pre-aggregate rolling measure with `count_distinct` #7730

Open naerymdan opened 9 months ago

naerymdan commented 9 months ago

Describe the bug Trying to pre-aggregate a rolling_window measure give the error Error: Time series queries without dateRange aren't supported at BaseTimeDimension.timeSeries...

To Reproduce Create a measure with a rolling_window with a trailing start, create a pre-aggregation for it.

Expected behavior I expect the pre-aggregation to function correctly.

Minimally reproducible Cube Schema

cube(`session`, {
  sql_table: `test_db`,
  data_source: `default`,
  measures: {
    weeklyCount: {
      sql: `id`,
      type: `count`,
      rollingWindow: {
          leading: `1 week`,
          offset: `start`,
        },
    },
  },
  dimensions: {
    created_at: {
      sql: `${CUBE}."created_at"`,
      type: `time`
    }
  },
  preAggregations: {
    main: {
      measures: [session.weeklyCount],
      timeDimension: session.created_at,
      granularity: `day`
    }
  }
});

Version: 0.34.50

Additional context Using PostgreSQL, but might be the same issue as #6813 who is using MS SQL

igorlukanin commented 7 months ago

I've tried to build the pre-aggregation from the following data model on v0.35.18 with Postgres, and it worked for me:

cube(`session`, {
  sql: `
    SELECT 1 AS id, '2024-01-01T00:00:00.000Z' AS created_at UNION ALL
    SELECT 2 AS id, '2024-01-02T00:00:00.000Z' AS created_at UNION ALL
    SELECT 3 AS id, '2024-01-03T00:00:00.000Z' AS created_at
  `,
  data_source: `default`,
  measures: {
    weeklyCount: {
      sql: `id`,
      type: `count`,
      rollingWindow: {
          leading: `1 week`,
          offset: `start`,
        },
    },
  },
  dimensions: {
    created_at: {
      sql: `${CUBE}."created_at"`,
      type: `time`
    }
  },
  preAggregations: {
    main: {
      measures: [session.weeklyCount],
      timeDimension: session.created_at,
      granularity: `day`
    }
  }
});
Screenshot 2024-04-17 at 19 19 59 Screenshot 2024-04-17 at 19 20 03

Could you please upgrade to the latest version and double-check again?

igorlukanin commented 6 months ago

Closing this issue due to inactivity and lack of a reproduction. Please feel free to reopen.

nbphuoc commented 5 months ago

I experience this issue too. Using count_distinct_approx and count works fine, but using count_distinct results in the mentioned error.

alrolorojas commented 5 months ago

Same thing for me as well!

igorlukanin commented 5 months ago

using count_distinct results in the mentioned error

OK, I was able to reproduce it with count_distinct. Also, adding partition_granularity, build_range_start, and build_range_end does not help.

I'd love to hear what @paveltiunov thinks.

igorlukanin commented 5 months ago

As a workaround, I can maybe recommend using count_distinct_approx with Postgres HLL extension: https://cube.dev/docs/product/configuration/data-sources/postgres#count_distinct_approx

nbphuoc commented 5 months ago

As a workaround, I can maybe recommend using count_distinct_approx with Postgres HLL extension: https://cube.dev/docs/product/configuration/data-sources/postgres#count_distinct_approx

I would love to use count_distinct_approx, but unfortunately, Cube.js doesn't support count_distinct_approx for Redshift pre-aggregation, even though Redshift supports HLL itself. Will this feature be added soon?

igorlukanin commented 2 months ago

@nbphuoc Please track this one: https://github.com/cube-js/cube/issues/8510

igorlukanin commented 1 month ago

Still reproduces on Cube v0.36.7, same behavior as reported:

Screenshot 2024-10-08 at 23 47 18