heterodb / pg-strom

PG-Strom - Master development repository
http://heterodb.github.io/pg-strom/
Other
1.3k stars 162 forks source link

[GPU Logic Bug] SELECT <agg_func> FROM <table> WHERE <column> IS DISTINCT FROM <column> Brings GPU Logic Bug #826

Closed qwebug closed 1 week ago

qwebug commented 1 month ago

Describe:

SELECT \ FROM \

WHERE \ IS DISTINCT FROM \ brings different results, when using CPU-only configurations and GPU-used configurations.

SQL with CPU-only Config:

CREATE TABLE t0(c0 int);
CREATE TABLE t1(LIKE t0);
INSERT INTO t1(c0) VALUES(1);
INSERT INTO t0(c0) VALUES(2);
CREATE SCHEMA extensions;
CREATE EXTENSION pg_strom WITH SCHEMA extensions;
SET pg_strom.enabled=off;
SELECT MAX(0.9118670838105292) FROM t0, t1 WHERE ((t1.c0)IS DISTINCT FROM(t0.c0));

Result:

        max         
--------------------
 0.9118670838105292
(1 row)

SQL with GPU-used Config:

SET pg_strom.enabled=on;
SELECT MAX(0.9118670838105292) FROM t0, t1 WHERE ((t1.c0)IS DISTINCT FROM(t0.c0));

Result:

 max 
-----

(1 row)

Environment:

Pg-strom Version: commit 9765660eba4fab7e4e9378c667e6bd565da83ffd

PostgreSQL Version: 15.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-20), 64-bit

CUDA Version: 12.2

NVIDIA Driver Version: 535.171.04

kaigai commented 1 month ago

This commit you tested (9765660eba4fab7e4e9378c667e6bd565da83ffd) is old; more that a year before. I could not reproduce the problem using 9389fb758eab82af905543fe7a37097d6e55ca62, please retry it with the latest commit.

qwebug commented 1 month ago

This commit you tested (9765660eba4fab7e4e9378c667e6bd565da83ffd) is old; more that a year before. I could not reproduce the problem using 9389fb758eab82af905543fe7a37097d6e55ca62, please retry it with the latest commit.

The results are also slightly different, when checking it in commit 9389fb758eab82af905543fe7a37097d6e55ca62.

        max         
--------------------
 0.9118670838105292
(1 row)

        max        
-------------------
 0.911867083810529
(1 row)
kaigai commented 1 month ago

This is not a bug, but specification of PG-Strom.

Due to atomic operation in CUDA, we convert numeric values to float8 on max() or min().

You can see the pgstrom.pmax() takes a converted value to double precision. It is the reason why the result is a bit different.

postgres=# explain SELECT MAX(0.9118670838105292) FROM t0, t1 WHERE ((t1.c0)IS DISTINCT FROM(t0.c0));
                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 Aggregate  (cost=3234.40..3234.41 rows=1 width=32)
   ->  Custom Scan (GpuPreAgg) on t0  (cost=3234.39..3234.40 rows=1 width=32)
         GPU Projection: pgstrom.pmax((0.9118670838105292)::double precision)
         GPU Join Quals [1]: (t1.c0 IS DISTINCT FROM t0.c0) ... [nrows: 2550 -> 6469988]
         GPU-Direct SQL: enabled (N=2,GPU0,1)
         ->  Seq Scan on t1  (cost=0.00..35.50 rows=2550 width=4)
(6 rows)

You can see the same result as follows:

postgres=# set pg_strom.enabled = off;
SET
postgres=# SELECT MAX(0.9118670838105292::float) FROM t0, t1 WHERE ((t1.c0)IS DISTINCT FROM(t0.c0));
        max
--------------------
 0.9118670838105292
(1 row)

And, we plan to keep precision of numeric at the issue #806 in the milestone of v5.3.