cube-js / cube

📊 Cube — The Semantic Layer for Building Data Applications
https://cube.dev
Other
17.96k stars 1.78k forks source link

RollupJoin aggregation gives misleading error when querying timestamp related dimenstion #8772

Open kevinleeTCA opened 1 month ago

kevinleeTCA commented 1 month ago

Failed SQL

  1. Rollup join resolving value of timestamp works fine Screenshot 2024-10-03 at 3 01 09 pm

  2. Rollup join resolving day or year, etc of timestamp is broken with misleading error msg.

Screenshot 2024-10-03 at 3 04 40 pm Screenshot 2024-10-03 at 3 15 28 pm

To join across data sources use rollupJoin with Cube Store. If rollupJoin is defined, this error indicates it doesn't match the query. Please use Rollup Designer to verify it's definition. Found data sources: propertyManagement, relation

this error msg is really confusing as it does not help with debug at all, we encountered a lot such error msgs when using rollup join, but the root causes have nothing to do with the rollup definition mentioned in the msg.

  1. Query single cube esolving day or year, etc of timestamp is fine. Screenshot 2024-10-03 at 3 05 19 pm

Logical Plan Search for Can't rewrite plan log message.

Version: cube: 0.36.2

Screenshot 2024-10-03 at 3 15 02 pm

Additional context We have 2 cubes (from different data source, postgres database), and we have 2 pre-aggregations rollups defined respectively in each cube. We have a rollupJoin to join data between these 2 models.

The problematic managementStartDate has date type from postgres schema

PM_Teams Cube

cube(`PM_Teams`, {
  sql: ...,

  preAggregations: {
    // rollup
    pMTeamsRollup: {
      dimensions: [
        CUBE.teamId,
        CUBE.teamName,
        CUBE.organisationId,
        CUBE.legalEntityId,
        CUBE.managementAilorn,
        CUBE.propertyAddress,
        CUBE.propertyType,
        CUBE.managementId,
        CUBE.managementStartDate,
        CUBE.managementEndDate,
      ],
      indexes: {
        mgIdx: {
          columns: [CUBE.managementAilorn],
        },
      },
      refresh_key: {
        every: `1 hour`,
      }
    },

  // 'rollupJoin'
    groupTeamsRollup: {
      type: `rollupJoin`,
      measures: [],
      dimensions: [
        PM_Teams.managementAilorn,
        PM_Teams.organisationId,
        PM_Teams.legalEntityId,
        PM_Teams.propertyType,
        PM_Teams.propertyAddress,
        PM_Teams.teamId,
        PM_Teams.teamName,
        PM_Teams.managementId,
        PM_Teams.managementStartDate,
        PM_Teams.managementEndDate,
        AMGroupLinks.groupId,
        AMGroupLinks.groupName,
      ],
      rollups: [PM_Teams.pMTeamsRollup, AMGroupLinks.aMGLRollup],
      refresh_key: {
        every: `1 hour`,
      }
  }
}
...
})

AMGroupLinks cube

cube(`AMGroupLinks`, {
  sql: `...`,

  preAggregations: {
    aMGLRollup: {
      dimensions: [
        CUBE.groupId,
        CUBE.groupName,
        CUBE.managementId,
        CUBE.organisationId,
      ],
      indexes: {
        mgIdx: {
          columns: [CUBE.groupId, CUBE.managementId],
        },
      },
      refresh_key: {
        every: `1 hour`,
      }
    },
  },

  dimensions: {
    groupId: {
      sql: `id`,
      type: `string`,
      primaryKey: true,
      shown: true,
    },
    groupName: {
      sql: `name`,
      type: `string`,
    },
    managementId: {
      sql: `management_id`,
      type: `string`,
      primaryKey: true,
      shown: true,
    },
    organisationId: {
      sql: `organisation_id`,
      type: `string`,
    },
  },
 ...
});
paveltiunov commented 3 weeks ago

@kevinleeTCA I guess it's because managementStartDate referred as a dimension in the rollup definition instead of time_dimension. When it's dimension it's raw value instead of date trunc value you expect.