4paradigm / OpenMLDB

OpenMLDB is an open-source machine learning database that provides a feature platform computing consistent features for training and inference.
https://openmldb.ai
Apache License 2.0

Incorrect result from rewrite SQL when deploying with the OpenMLDB Docker image, but not with onebox #3942

Open tobegit3hub opened 1 month ago

tobegit3hub commented 1 month ago

Bug Description

We are using the OpenMLDB 0.9.0 Docker image and set up the cluster with ./init.sh. The following SQL returns an incorrect result.

create database if not exists tpcc;

use tpcc;

create table orders (
  o_w_id       integer,
  o_d_id       integer,
  o_id         integer,
  o_c_id       integer,
  o_carrier_id integer,
  o_ol_cnt     integer,
  o_all_local  integer,
  o_entry_d    timestamp,
  INDEX(KEY=o_c_id, TS=o_entry_d)
);

      SELECT o_id, avg_7d_cnt, avg_15d_cnt, avg_30d_cnt, max_7d_cnt, max_15d_cnt, max_30d_cnt, min_7d_cnt, min_15d_cnt, min_30d_cnt
      FROM (
        SELECT o_id,
          CAST(avg(o_ol_cnt) OVER w_1 AS DOUBLE) AS avg_7d_cnt,
          CAST(avg(o_ol_cnt) OVER w_3 AS DOUBLE) AS avg_15d_cnt,
          CAST(avg(o_ol_cnt) OVER w_5 AS DOUBLE) AS avg_30d_cnt,
          max(o_ol_cnt) OVER w_1 AS max_7d_cnt,
          max(o_ol_cnt) OVER w_3 AS max_15d_cnt,
          max(o_ol_cnt) OVER w_5 AS max_30d_cnt,
          min(o_ol_cnt) OVER w_1 AS min_7d_cnt,
          min(o_ol_cnt) OVER w_3 AS min_15d_cnt,
          min(o_ol_cnt) OVER w_5 AS min_30d_cnt,
          label FROM (
            SELECT 1 AS o_w_id, 1 AS o_d_id, 1 AS o_id, 930 AS o_c_id, 2 AS o_carrier_id, 11 AS o_ol_cnt, 1 AS o_all_local, timestamp("2024-05-25 17:00:26") AS O_ENTRY_D, 0 as label
            UNION ALL
            SELECT *, 1 as label FROM orders
         ) t
         WINDOW
           w_1 AS (PARTITION BY o_c_id ORDER BY o_entry_d ROWS BETWEEN 1 PRECEDING AND CURRENT ROW),
           w_3 AS (PARTITION BY o_c_id ORDER BY o_entry_d ROWS BETWEEN 3 PRECEDING AND CURRENT ROW),
           w_5 AS (PARTITION BY o_c_id ORDER BY o_entry_d ROWS BETWEEN 5 PRECEDING AND CURRENT ROW)
      ) t WHERE label = 0;

However, the same SQL returns the correct result when the cluster is deployed with onebox.
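
To compare the two deployments against a known-good answer, a small reference computation outside OpenMLDB may help. The sketch below is only an illustration of what the query is expected to compute for the label = 0 row: the rows in `orders` are hypothetical (this report does not include the table contents), the helper `window_features` is not part of any OpenMLDB API, and it assumes no existing row shares the request row's timestamp.

```python
from datetime import datetime

# Hypothetical rows standing in for the contents of `orders`;
# the actual table data is not part of this report.
# Each tuple is (o_id, o_c_id, o_ol_cnt, o_entry_d).
orders = [
    (10, 930, 5, datetime(2024, 5, 20, 9, 0, 0)),
    (11, 930, 7, datetime(2024, 5, 22, 9, 0, 0)),
    (12, 931, 9, datetime(2024, 5, 23, 9, 0, 0)),
]

# The literal row from the query (the one kept by WHERE label = 0).
request = (1, 930, 11, datetime(2024, 5, 25, 17, 0, 26))


def window_features(request, table, preceding):
    """Aggregate o_ol_cnt over ROWS BETWEEN `preceding` PRECEDING AND CURRENT ROW.

    Mirrors PARTITION BY o_c_id ORDER BY o_entry_d, with the request row
    as the current row. Assumes no timestamp ties with the request row.
    """
    _, c_id, _, ts = request
    # Rows in the same partition that sort before the request row.
    earlier = sorted(
        (r for r in table if r[1] == c_id and r[3] < ts),
        key=lambda r: r[3],
    )
    rows = earlier + [request]
    # Keep the current row plus at most `preceding` rows before it.
    frame = [r[2] for r in rows[-(preceding + 1):]]
    return {"avg": sum(frame) / len(frame), "max": max(frame), "min": min(frame)}


for name, preceding in (("w_1", 1), ("w_3", 3), ("w_5", 5)):
    print(name, window_features(request, orders, preceding))
```

Loading the same rows into both the Docker-based cluster and the onebox deployment and comparing each output with these values should show which columns differ.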