heterodb / pg-strom

PG-Strom - Master development repository
http://heterodb.github.io/pg-strom/
Other
1.3k stars 162 forks source link

[GPU Logic Bug] SELECT DISTINCT ON <column> FROM <table> WHERE <condition> Brings GPU Logic Bug #828

Closed qwebug closed 1 week ago

qwebug commented 1 month ago

Describe:

SELECT DISTINCT ON \ FROM \

WHERE \ brings different results, when using CPU-only configurations and GPU-used configurations.

SQL with CPU-only Config:

CREATE TABLE t2(c0 bigint UNIQUE PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY NOT NULL, c1 DECIMAL );
CREATE TABLE t3(LIKE t2);
INSERT INTO t2(c0, c1) VALUES(4, 0.8), (0, 0.02), (-1, 0.2);
INSERT INTO t3(c0, c1) VALUES(-1, 0.4);
INSERT INTO t3(c1, c0) VALUES(0.4, -1), (0.2, -2), (0.08, -1);
CREATE SCHEMA extensions;
CREATE EXTENSION pg_strom WITH SCHEMA extensions;
SET pg_strom.enabled=off;
SELECT DISTINCT ON (t2.c1) * FROM t3, t2 WHERE ((t3.c1)!=(t2.c1));

Result:

 c0 |  c1  | c0 |  c1  
----+------+----+------
 -1 |  0.4 |  0 | 0.02
 -1 | 0.08 | -1 |  0.2
 -2 |  0.2 |  4 |  0.8
(3 rows)

SQL with GPU-used Config:

SET pg_strom.enabled=on;
SELECT DISTINCT ON (t2.c1) * FROM t3, t2 WHERE ((t3.c1)!=(t2.c1));

Result:

 c0 |  c1  | c0 |  c1  
----+------+----+------
 -1 |  0.4 |  0 | 0.02
 -1 | 0.08 | -1 |  0.2
 -1 |  0.4 |  4 |  0.8
(3 rows)

Environment:

Pg-strom Version: commit 9765660eba4fab7e4e9378c667e6bd565da83ffd

PostgreSQL Version: 15.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-20), 64-bit

CUDA Version: 12.2

NVIDIA Driver Version: 535.171.04

kaigai commented 1 month ago

This is not a bug, but specification. PG-Strom runs per-row evaluation in parallel, so GpuScan's results order is not guaranteed. This query specifies DISTINCT ON (t2.c1), thus t2.c1 is unique but other columns are not expectational.

These values were come from the tuple which scceeded atomicCAS operation on the GPU result buffer, and this behavior is not illigal.