cube-js / cube

📊 Cube — Universal semantic layer platform for AI, BI, spreadsheets, and embedded analytics
https://cube.dev

Cannot represent '-109999999000000' with scale 5 without loosing precision #8858

Open kevinleeTCA opened 1 month ago

kevinleeTCA commented 1 month ago

Problem

We switched Cube's pre-aggregation storage from the Postgres (memory) driver to Cube Store and ran into several unsupported features and data-precision issues. One of them is:

ERROR [cubestore::cluster::ingestion::job_runner] <pid:1> Running job error (71.195617ms): IdRow { id: 28798, row: Job { row_reference: Table(Tables, 9052), job_type: TableImportCSV("temp://dev_pre_aggregations.glrc_rent_rollup20230101_5zqy4hdw_1phqk2e_1jhgfsl-0.csv.gz"), last_heart_beat: 2024-10-23T00:13:13.558610173Z, status: Error("Can't parse '-109999999000000' column value for 'glrc__rent_amount' column: cannot represent '-109999999000000' with scale 5 without loosing precision") } }

Related rollup

preAggregations: {
  rentRollup: {
    measures: [CUBE.rentAmount, CUBE.feeIncomeAmount, CUBE.managementFeeIncomeAmount, CUBE.count],
    dimensions: [CUBE.generalLedgerManagement, CUBE.accountOwner, CUBE.accountBookType],
    timeDimension: CUBE.createdAt,
    granularity: `day`,
    indexes: {
      mgIdx: {
        columns: [CUBE.generalLedgerManagement],
      },
    },
  },
  // ...
},

measures: {
  rentAmount: {
    sql: `amount`,
    type: `sum`,
    filters: [
      {
        sql: `${CUBE.transactionTaxCategoryId} = 'RENT'`,
      },
    ],
  },
  // ...
},

The `amount` column in Postgres is defined as `bigint` (8 bytes, -9223372036854775808 to +9223372036854775807); we use it to represent money amounts in cents (a big integer).

So what is this converted to when computing the `sum` measure, and why is the value -109999999000000 a problem here?

The Cube and Cube Store version we are using is 1.0.1 (the latest, published a couple of days ago).

kevinleeTCA commented 1 week ago

We currently work around it by converting the value explicitly to float:

rentAmount: {
  sql: `amount::float`,
  type: `sum`,
},

But I am not sure this is correct: `bigint` in Postgres is 8 bytes, and if float in Cube is only 4 bytes, we will still hit overflow issues when large values appear. @paveltiunov do you know the actual length of float in Cube?
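One note on the Postgres side of the cast (how Cube/Cube Store handles the value afterwards is the open question): in PostgreSQL, `float` with no precision is an alias for `double precision`, i.e. an 8-byte IEEE 754 double, not a 4-byte `real`. A double cannot overflow on any `bigint` value, but it only represents integers exactly up to 2^53, so very large cent amounts could silently lose exactness rather than error out. A quick sketch of those limits:

```javascript
// Doubles (8 bytes, 53-bit mantissa) are exact only up to 2^53.
console.log(Number.MAX_SAFE_INTEGER);                  // 9007199254740991 (2^53 - 1)
console.log(9007199254740993 === 9007199254740992);    // true: both round to the same double

// The value from the error log is well within the exact range,
// so the ::float workaround happens to be lossless for it.
console.log(Number.isSafeInteger(-109999999000000));   // true
```

So the cast trades a hard error for potential silent rounding above ~9 * 10^15 cents, which may or may not be acceptable for money amounts.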