ClickHouse / ClickHouse

ClickHouse® is a real-time analytics DBMS
https://clickhouse.com
Apache License 2.0
37.68k stars 6.91k forks source link

NaN Comparison Issue in WHERE Clause #71899

Open dwenking opened 5 days ago

dwenking commented 5 days ago

Company or project name

WingTecher Lab

Describe what's wrong

When filtering with a condition c0 != -8.0324759543107315e+37, no rows are returned even though c0 contains NaN, which should not equal any numeric value.

How to reproduce

Environment:

(base) dwenking@Macbook clickhouse % curl https://clickhouse.com/ | sh
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2822    0  2822    0     0  10473      0 --:--:-- --:--:-- --:--:-- 10649

Will download https://builds.clickhouse.com/master/macos-aarch64/clickhouse into clickhouse

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 70.4M  100 70.4M    0     0  6773k      0  0:00:10  0:00:10 --:--:-- 8257k

Successfully downloaded the ClickHouse binary, you can run it as:
    ./clickhouse
(base) dwenking@Macbook clickhouse % ./clickhouse
...
ClickHouse local version 24.9.1.1350 (official build).

Test case:

Set up:

CREATE TABLE t0 (c0 Int16, c1 UInt32) ORDER BY c0;
INSERT INTO t0 (c0, c1) VALUES (-2913, 7583471);
CREATE TABLE t1 ORDER BY c0 AS (SELECT (varSamp(c0)) AS c0, c1 AS c1 FROM t0 GROUP BY c1);

Macbook.local :) SELECT * FROM t1;

SELECT *
FROM t1

Query id: 644f1d5f-e1fd-406e-9ac3-9a3f25be21cb

   ┌──c0─┬──────c1─┐
1. │ nan │ 7583471 │
   └─────┴─────────┘

1 row in set. Elapsed: 0.003 sec.

Macbook.local :) SELECT c0, c1 FROM t1 WHERE ((c0 != (-8.0324759543107315e+37)));

Execute this SELECT:

Macbook.local :) SELECT c0, c1 FROM t1 WHERE ((c0 != (-8.0324759543107315e+37)));

SELECT
    c0,
    c1
FROM t1
WHERE c0 != -8.0324759543107315e37

Query id: cf74a4b0-de4a-4bf7-b1c1-b81a658cc6ca

Ok.

0 rows in set. Elapsed: 0.006 sec. 

Expected returns (nan, 7583471) from t0 since nan != -8.0324759543107315.

Expected behavior

The query should return the row (NaN, 7583471) from t1.

geldot commented 4 days ago

Interesting. Returns no results when c0 includes nan and c0 is in the ORDER BY:

CREATE TABLE t0 (c0 Float32) ENGINE MergeTree() ORDER BY c0;
INSERT INTO t0 VALUES (1), (2), (nan);
SELECT * FROM t0 WHERE c0 != 1;

0 rows in set. Elapsed: 0.008 sec.

Result as expected when no order key specified:

CREATE TABLE t0 (c0 Float32) ENGINE MergeTree() ORDER BY ();
INSERT INTO t0 VALUES (1), (2), (nan);
SELECT * FROM t0 WHERE c0 != 1;

   ┌──c0─┐
1. │   2 │
2. │ nan │
   └─────┘

2 rows in set. Elapsed: 0.006 sec.

Result as expected when nan not present:

CREATE TABLE t0 (c0 Float32) ENGINE MergeTree() ORDER BY c0;
INSERT INTO t0 VALUES (1), (2);
SELECT * FROM t0 WHERE c0 != 1;

   ┌─c0─┐
1. │  2 │
   └────┘

1 row in set. Elapsed: 0.003 sec.