duckdb / duckdb

DuckDB is an analytical in-process SQL database management system
http://www.duckdb.org
MIT License
24.22k stars 1.92k forks source link

SELECT DISTINCT ON does not respect ORDER BY #6524

Closed zenazn closed 1 year ago

zenazn commented 1 year ago

What happens?

When running a SELECT DISTINCT ON ... ORDER BY ... query, I expect DuckDB to return the first row—as defined by the query's ORDER BY clause—for each unique DISTINCT ON. Instead, DuckDB picks a random row.

Postgres docs for the behavior I expect.

To Reproduce

Extracted from a production use case:

CREATE TABLE fuzzy_data AS (SELECT * FROM VALUES (1, 2, 'a'), (1, null, 'b'), (2, null, 'c'), (null, 4, 'd') t(a, b, foo));

SELECT DISTINCT ON (t.a)
"__join1"."a" IS NULL as j1a,
"__join1"."b" IS NULL as j1b,
"__join2"."a" IS NULL as j2a,
"__join2"."b" IS NULL as j2b,
FROM VALUES (1) t(a)
LEFT JOIN "fuzzy_data" AS "__join1" ON ("__join1"."a" IS NOT DISTINCT FROM 1 OR "__join1"."a" IS NULL) AND ("__join1"."b" IS NOT DISTINCT FROM 2 OR "__join1"."b" IS NULL)
LEFT JOIN "fuzzy_data" AS "__join2" ON ("__join2"."a" IS NOT DISTINCT FROM 1 OR "__join2"."a" IS NULL) AND ("__join2"."b" IS NOT DISTINCT FROM 3 OR "__join2"."b" IS NULL)
ORDER BY t.a, "__join1"."a" IS NULL, "__join1"."b" IS NULL, "__join2"."a" IS NULL, "__join2"."b" IS NULL;

Here's the result of running a few variants of that query in a duckdb 0.7.1 shell:

  1. The query above
  2. The query minus the DISTINCT ON (t.a) part
  3. EXPLAIN of that query
┌─────────┬─────────┬─────────┬─────────┐
│   j1a   │   j1b   │   j2a   │   j2b   │
│ boolean │ boolean │ boolean │ boolean │
├─────────┼─────────┼─────────┼─────────┤
│ false   │ true    │ false   │ true    │
└─────────┴─────────┴─────────┴─────────┘
┌─────────┬─────────┬─────────┬─────────┐
│   j1a   │   j1b   │   j2a   │   j2b   │
│ boolean │ boolean │ boolean │ boolean │
├─────────┼─────────┼─────────┼─────────┤
│ false   │ false   │ false   │ true    │
│ false   │ true    │ false   │ true    │
└─────────┴─────────┴─────────┴─────────┘

┌─────────────────────────────┐
│┌───────────────────────────┐│
││       Physical Plan       ││
│└───────────────────────────┘│
└─────────────────────────────┘
┌───────────────────────────┐
│          ORDER_BY         │
│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│          ORDERS:          │
│           #5 ASC          │
│  (__join1.a IS NULL) ASC  │
│  (__join1.b IS NULL) ASC  │
│  (__join2.a IS NULL) ASC  │
│  (__join2.b IS NULL) ASC  │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│         PROJECTION        │
│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│             #1            │
│             #2            │
│             #3            │
│             #4            │
│             #0            │
│             #5            │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│       HASH_GROUP_BY       │
│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│             #0            │
│         first(#1)         │
│         first(#2)         │
│         first(#3)         │
│         first(#4)         │
│         first(#5)         │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│         PROJECTION        │
│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│             #4            │
│             #0            │
│             #1            │
│             #2            │
│             #3            │
│             #5            │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│         PROJECTION        │
│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│            j1a            │
│            j1b            │
│            j2a            │
│            j2b            │
│             a             │
│             a             │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│     BLOCKWISE_NL_JOIN     │
│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│            LEFT           ├───────────────────────────────────────────┐
│            true           │                                           │
└─────────────┬─────────────┘                                           │
┌─────────────┴─────────────┐                             ┌─────────────┴─────────────┐
│     BLOCKWISE_NL_JOIN     │                             │           FILTER          │
│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │                             │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│            LEFT           │                             │ (((a IS NULL) OR (a IS NOT│
│            true           │                             │  DISTINCT FROM 1)) AND ((b│
│                           ├──────────────┐              │    IS NULL) OR (b IS NOT  │
│                           │              │              │     DISTINCT FROM 3)))    │
│                           │              │              │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│                           │              │              │           EC: 0           │
└─────────────┬─────────────┘              │              └─────────────┬─────────────┘
┌─────────────┴─────────────┐┌─────────────┴─────────────┐┌─────────────┴─────────────┐
│      COLUMN_DATA_SCAN     ││           FILTER          ││          SEQ_SCAN         │
│                           ││   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   ││   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│                           ││ (((a IS NULL) OR (a IS NOT││         fuzzy_data        │
│                           ││  DISTINCT FROM 1)) AND ((b││   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│                           ││    IS NULL) OR (b IS NOT  ││             a             │
│                           ││     DISTINCT FROM 2)))    ││             b             │
│                           ││   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   ││   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│                           ││           EC: 0           ││           EC: 0           │
└───────────────────────────┘└─────────────┬─────────────┘└───────────────────────────┘
                             ┌─────────────┴─────────────┐
                             │          SEQ_SCAN         │
                             │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
                             │         fuzzy_data        │
                             │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
                             │             a             │
                             │             b             │
                             │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
                             │           EC: 0           │
                             └───────────────────────────┘

From the query plan it's pretty clear the ORDER_BY block is on the wrong side of the HASH_GROUP_BY block

OS:

OSX, M1

DuckDB Version:

0.7.1

DuckDB Client:

shell

Full Name:

Carl Jackson

Affiliation:

Watershed

Have you tried this on the latest master branch?

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

Mytherin commented 1 year ago

Duplicate of #2041. As explained there moving the ORDER BY below the HASH AGGREGATE is not a sufficient solution in a multi-threaded system - that only works in a single-threaded execution model. Instead what you want to do is push the ORDER BY into the FIRST aggregates, so that they become e.g. FIRST(b ORDER BY a). If the order clause is simple enough the system could also translate it into a MAX_BY/MIN_BY to improve execution speed.

Mytherin commented 1 year ago

Fixed now in #6616