postgrespro / vops

Other
166 stars 22 forks source link

The result of VOPS does not match PostgreSQL #16

Closed japinli closed 5 years ago

japinli commented 5 years ago

Hi, I am recently test VOPS and find that the result of sum and avg are not the same as PostgreSQL. The following code is my test case.

postgres=# CREATE EXTENSION vops;
CREATE EXTENSION
postgres=# CREATE TABLE LINEITEM ( L_ORDERKEY    INTEGER NOT NULL,
                             L_PARTKEY     INTEGER NOT NULL,
                             L_SUPPKEY     INTEGER NOT NULL,
                             L_LINENUMBER  INTEGER NOT NULL,
                             L_QUANTITY    DECIMAL(15,2) NOT NULL,
                             L_EXTENDEDPRICE  DECIMAL(15,2) NOT NULL,
                             L_DISCOUNT    DECIMAL(15,2) NOT NULL,
                             L_TAX         DECIMAL(15,2) NOT NULL,
                             L_RETURNFLAG  CHAR(1) NOT NULL,
                             L_LINESTATUS  CHAR(1) NOT NULL,
                             L_SHIPDATE    DATE NOT NULL,
                             L_COMMITDATE  DATE NOT NULL,
                             L_RECEIPTDATE DATE NOT NULL,
                             L_SHIPINSTRUCT CHAR(25) NOT NULL,
                             L_SHIPMODE     CHAR(10) NOT NULL,
                             L_COMMENT      VARCHAR(44) NOT NULL);
CREATE TABLE
postgres=# COPY LINEITEM FROM '/path/to/lineitem1.csv' DELIMITER '|' CSV;
COPY 986052
postgres=# CREATE TABLE vops_discount( l_discount vops_float4 not null);
CREATE TABLE
postgres=# select populate(destination := 'vops_discount'::regclass, source := 'LINEITEM'::regclass);
 populate
----------
   986052
(1 row)

postgres=# select sum(l_discount) from lineitem;
   sum
----------
 49368.98
(1 row)

postgres=# select sum(l_discount) from vops_discount;
         sum
----------------------
 7.17329743361851e-32
(1 row)

postgres=# select avg(l_discount) from lineitem;
          avg
------------------------
 0.05006731896492274241
(1 row)

postgres=# select avg(l_discount) from vops_discount;
         avg
---------------------
 7.2747658679446e-38
(1 row)

The VOPS commit is 7ffe6fd9ffc62e2, and PostgreSQL is REL_10_7.

knizhnik commented 5 years ago

Sorry, but VOPS requires that types of columns in source and destination table for populate functions should be compatible. Unfortunately there was not such check, so it is possible to perform incorrect import of data. I have committed patch with such check which prohibit import as in your case when DECIMAL(15,2) is assigned to FLOAT4. You can use view which perform will cast column of original table to the requested type and pass name of this view to populate function instead of original table.

japinli commented 5 years ago

Thanks for your replay. When I use float4 it works.

japinli commented 5 years ago

According to the patch, VOPS does not support REL_10_7 anymore.

knizhnik commented 5 years ago

Sorry, this newly introduced incompatibility with 10.x version is fixed.