exasol / metabase-driver

Exasol driver for Metabase
GNU Affero General Public License v3.0
5 stars 0 forks source link

"Week" aggregation is inconsistent with setting "Start of the week" is set to "Monday" #59

Closed ppavlov39 closed 1 year ago

ppavlov39 commented 1 year ago

Describe the bug

Hello! When setting "Start-of-the-week" is set to "Monday" and we use a query builder then a weekly aggregation query gets wrong result - the first day of the aggregation is the Tuesday of previous week. If setting "Start-of-the-week" is set to "Sunday" all works correct.

Steps To Reproduce

Steps to reproduce the behavior:

  1. Select "Monday" as "Start of the week" in Admin settings.
  2. Aggregate data on "Week"
  3. Find data belonging to previous week with Tuesday as first day of week.

Expected behavior

Data from previous week is not included in the result and first day of week is Monday.

Example

  1. Prepare a query in the query builder with filter on date field image
  2. Get results from previous week with wrong first day (27.12, 03.01 and 10.01 are Tuesdays) image

Here is the query that MB generates:

SELECT (CAST(truncate(CAST(("BD"."TABLE"."COLUMN" + numtodsinterval(-1, 'day')) AS date), 'day') AS date) + numtodsinterval(1, 'day')) AS "COLUMN"
FROM "BD"."TABLE"
WHERE ("BD"."TABLE"."COLUMN" >= timestamp '2023-01-02 00:00:00.000'
   AND "BD"."TABLE"."COLUMN" < timestamp '2023-01-16 00:00:00.000')
GROUP BY (CAST(truncate(CAST(("BD"."TABLE"."COLUMN" + numtodsinterval(-1, 'day')) AS date), 'day') AS date) + numtodsinterval(1, 'day'))
ORDER BY (CAST(truncate(CAST(("BD"."TABLE"."COLUMN" + numtodsinterval(-1, 'day')) AS date), 'day') AS date) + numtodsinterval(1, 'day')) ASC

And executing this query without MB gives the same result.

Setup

This behavior is also present in metabase 0.42.4 with driver version 1.0.0

Dependencies

kaklakariada commented 1 year ago

@ppavlov39 thank you for your detailed and helpful bug report! We will investigate how to fix this.

kaklakariada commented 1 year ago

Hi @ppavlov39, I can reproduce this with the latest versions of Metabase and the Exasol driver. However I could not find a reason why Metabase generates the observed SQL queries. I tried it with MySQL and get exactly the same result. The generated queries are also similar.

Queries ### First day of week in Metabase: Sunday #### Exasol ```sql SELECT (CAST(truncate(CAST(("BD"."TABLE"."COLUMN" + numtodsinterval(-1, 'day')) AS date), 'day') AS timestamp) + numtodsinterval(1, 'day')) AS "COLUMN", count(*) AS "count" FROM "BD"."TABLE" WHERE ("BD"."TABLE"."COLUMN" >= timestamp '2023-01-02 00:00:00.000' AND "BD"."TABLE"."COLUMN" < timestamp '2023-01-16 00:00:00.000') GROUP BY (CAST(truncate(CAST(("BD"."TABLE"."COLUMN" + numtodsinterval(-1, 'day')) AS date), 'day') AS timestamp) + numtodsinterval(1, 'day')) ORDER BY (CAST(truncate(CAST(("BD"."TABLE"."COLUMN" + numtodsinterval(-1, 'day')) AS date), 'day') AS timestamp) + numtodsinterval(1, 'day')) ASC ``` #### MySQL ```sql SELECT date_add(str_to_date(concat(yearweek(date_add(`TAB`.`col`, INTERVAL -1 day)), ' Sunday'), '%X%V %W'), INTERVAL 1 day) AS `col`, count(*) AS `count` FROM `TAB` WHERE (`TAB`.`col` >= convert_tz('2023-01-02 00:00:00.000', 'UTC', @@session.time_zone) AND `TAB`.`col` < convert_tz('2023-01-16 00:00:00.000', 'UTC', @@session.time_zone)) GROUP BY date_add(str_to_date(concat(yearweek(date_add(`TAB`.`col`, INTERVAL -1 day)), ' Sunday'), '%X%V %W'), INTERVAL 1 day) ORDER BY date_add(str_to_date(concat(yearweek(date_add(`TAB`.`col`, INTERVAL -1 day)), ' Sunday'), '%X%V %W'), INTERVAL 1 day) ASC ``` ### First day of week in Metabase: Monday #### Exasol ```sql SELECT truncate(CAST("BD"."TABLE"."COLUMN" AS date), 'day') AS "COLUMN", count(*) AS "count" FROM "BD"."TABLE" WHERE ("BD"."TABLE"."COLUMN" >= timestamp '2023-01-02 00:00:00.000' AND "BD"."TABLE"."COLUMN" < timestamp '2023-01-16 00:00:00.000') GROUP BY truncate(CAST("BD"."TABLE"."COLUMN" AS date), 'day') ORDER BY truncate(CAST("BD"."TABLE"."COLUMN" AS date), 'day') ASC ``` #### MySQL ```sql SELECT str_to_date(concat(yearweek(`TAB`.`col`), ' Sunday'), '%X%V %W') AS `col`, count(*) AS `count` FROM `TAB` WHERE (`TAB`.`col` >= convert_tz('2023-01-02 00:00:00.000', 'UTC', @@session.time_zone) AND `TAB`.`col` < convert_tz('2023-01-16 00:00:00.000', 'UTC', @@session.time_zone)) GROUP BY str_to_date(concat(yearweek(`TAB`.`col`), ' Sunday'), '%X%V %W') ORDER BY str_to_date(concat(yearweek(`TAB`.`col`), ' Sunday'), '%X%V %W') ASC ```

I also verified that the Exasol driver correctly reports Sunday as db-start-of-week, which is the default for Exasol. You can verify this with this query:

SELECT SESSION_PARAMETER(current_session, 'NLS_FIRST_DAY_OF_WEEK') AS SESSION_VALUE

This should return 7, meaning Sunday (1-7 for Monday-Sunday).

For me this looks like a general issue with Metabase and I am not sure how to fix this in the Exasol driver.

redcatbear commented 1 year ago

If it is indeed a Metabase issue, it should be fixed at the source.

ppavlov39 commented 1 year ago

I also thought that it was metabase problem, and almost created a new issue, but then I read this one and some others, where it was indicated that it was solved by fixing a DB driver. Could you check that issue, please.

kaklakariada commented 1 year ago

@ppavlov39 thank you for the hint, I will look into this. Because of bad timing I already created an issue for metabase: https://github.com/metabase/metabase/issues/28018

kaklakariada commented 1 year ago

Hi @ppavlov39, I think I found the root cause. Please run the following query to check the value of the NLS_FIRST_DAY_OF_WEEK setting:

select * from EXA_PARAMETERS where parameter_name = 'NLS_FIRST_DAY_OF_WEEK';

I assume it is 1 (= Monday) on your system. When I set it to 1 I get this wrong result for the query, exactly like you:

COLUMN               count  
-------------------  -----  
2022-12-27 00:00:00  1      
2023-01-03 00:00:00  7      
2023-01-10 00:00:00  6      

Please try to set the value to 7 (= Sunday, default) with alter system set NLS_FIRST_DAY_OF_WEEK = 7; and create a new connection to the database, then you should get the correct result.

The Exasol driver for Metabase always uses Sunday as first day of the week (= default for Exasol) due to restrictions of the Metabase API. There is an open issue in Metabase to allow drivers to read the current setting from the database.

ppavlov39 commented 1 year ago

Thanks a lot! I've checked this and got a correct result. If I understand you correctly, now we have to wait when Metabase API will allow drivers to read the current setting for the start of the week from the database?

redcatbear commented 1 year ago

Yes, that is the precondition. I am marking this ticket as blocked by metabase/metabase#13496

redcatbear commented 1 year ago

Since this ticket is still blocked, I am moving it back from "in progress" to the backlog.

redcatbear commented 1 year ago

Upstream ticket still open.

redcatbear commented 1 year ago

The upstream ticket is still open with no progress since 2020. Realistically, I don't think Metabase will fix that, so I am closing this blocked ticket here.

Should Metabase fix that, @ppavlov39 please feel free to reopen this ticket here.

kaklakariada commented 1 year ago

This is documented as a known issue in the user guide (https://github.com/exasol/metabase-driver/issues/63 / https://github.com/exasol/metabase-driver/pull/64).