FirebirdSQL / firebird

Firebird server, client and tools
https://www.firebirdsql.org/
1.23k stars 213 forks source link

Regression in queries with FETCH FIRST ROW ONLY #7799

Open sim1984 opened 11 months ago

sim1984 commented 11 months ago

It seems that in some cases, implicitly applying the FIRST ROW optimization strategy degrades query performance.

The statistics for executing the following query in Firebird 4.0 and 5.0 are as follows:

WITH
  R AS (
    SELECT
      D.AVALUE AS DISTANCE,
      MIN(TL.TIME_PASSED_SEC) AS TIME_PASSED_SEC
    FROM
      TRIAL_LINE TL
      JOIN TRIAL T
        ON TL.CODE_TRIAL = T.CODE_TRIAL
      JOIN DISTANCE D
        ON T.CODE_DISTANCE = D.CODE_DISTANCE
    WHERE TL.CODE_HORSE = 742363
      AND D.AVALUE IN (1000, 1600, 2400, 3600, 4800)
    GROUP BY D.AVALUE
  )
SELECT * FROM R
ORDER BY IIF(R.DISTANCE = 1000, 1601, R.DISTANCE)
FETCH FIRST ROW ONLY

Firebird 4.0.3

Select Expression
    -> First N Records
        -> Sort (record length: 44, key length: 8)
            -> Aggregate
                -> Sort (record length: 108, key length: 8)
                    -> Nested Loop Join (inner)
                        -> Filter
                            -> Table "TRIAL_LINE" as "R TL" Access By ID
                                -> Bitmap
                                    -> Index "FK_TRIAL_LINE_HORSE" Range Scan (full match)
                        -> Filter
                            -> Table "TRIAL" as "R T" Access By ID
                                -> Bitmap
                                    -> Index "PK_TRIAL" Unique Scan
                        -> Filter
                            -> Table "DISTANCE" as "R D" Access By ID
                                -> Bitmap
                                    -> Index "PK_DISTANCE" Unique Scan

    DISTANCE       TIME_PASSED_SEC
============ =====================
        1600                99.800

Current memory = 2610992848
Delta memory = -112
Max memory = 2611163232
Elapsed time = 0.005 sec
Buffers = 153600
Reads = 0
Writes = 0
Fetches = 150
Per table statistics:
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+
 Table name                     | Natural | Index   | Insert  | Update  | Delete  | Backout | Purge   | Expunge |
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+
RDB$INDICES                     |         |       49|         |         |         |         |         |         |
DISTANCE                        |         |        9|         |         |         |         |         |         |
TRIAL                           |         |        9|         |         |         |         |         |         |
TRIAL_LINE                      |         |        9|         |         |         |         |         |         |
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+

Firebird 5.0 RC1

Select Expression
    -> First N Records
        -> Sort (record length: 44, key length: 8)
            -> Aggregate
                -> Nested Loop Join (inner)
                    -> Filter
                        -> Table "DISTANCE" as "R D" Access By ID
                            -> Index "UNK_DISTANCE_VALUE" List Scan (full match)
                    -> Filter
                        -> Table "TRIAL" as "R T" Access By ID
                            -> Bitmap
                                -> Index "FK_TRIAL_DISTANCE" Range Scan (full match)
                    -> Filter
                        -> Table "TRIAL_LINE" as "R TL" Access By ID
                            -> Bitmap
                                -> Index "UNQ_TRIAL_LINE_HORSE" Unique Scan

    DISTANCE       TIME_PASSED_SEC
============ =====================
        1600                99.800

Current memory = 2577991632
Delta memory = 624
Max memory = 2579873184
Elapsed time = 0.276 sec
Buffers = 153600
Reads = 0
Writes = 0
Fetches = 623456
Per table statistics:
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+
 Table name                     | Natural | Index   | Insert  | Update  | Delete  | Backout | Purge   | Expunge |
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+
DISTANCE                        |         |        5|         |         |         |         |         |         |
TRIAL                           |         |   123851|         |         |         |         |         |         |
TRIAL_LINE                      |         |        4|         |         |         |         |         |         |
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+

This can be fixed by rewriting the query like this:

WITH
  R AS (
    SELECT
      D.AVALUE AS DISTANCE,
      MIN(TL.TIME_PASSED_SEC) AS TIME_PASSED_SEC
    FROM
      TRIAL_LINE TL
      JOIN TRIAL T
        ON TL.CODE_TRIAL = T.CODE_TRIAL
      JOIN DISTANCE D
        ON T.CODE_DISTANCE = D.CODE_DISTANCE
    WHERE TL.CODE_HORSE = 742363
      AND D.AVALUE IN (1000, 1600, 2400, 3600, 4800)
    GROUP BY D.AVALUE
  )
SELECT * FROM R
ORDER BY IIF(R.DISTANCE = 1000, 1601, R.DISTANCE)
FETCH FIRST ROW ONLY
OPTIMIZE FOR ALL ROWS
Select Expression
    -> First N Records
        -> Sort (record length: 44, key length: 8)
            -> Aggregate
                -> Sort (record length: 108, key length: 8)
                    -> Filter
                        -> Hash Join (inner)
                            -> Nested Loop Join (inner)
                                -> Filter
                                    -> Table "TRIAL_LINE" as "R TL" Access By ID
                                        -> Bitmap
                                            -> Index "FK_TRIAL_LINE_HORSE" Range Scan (full match)
                                -> Filter
                                    -> Table "TRIAL" as "R T" Access By ID
                                        -> Bitmap
                                            -> Index "PK_TRIAL" Unique Scan
                            -> Record Buffer (record length: 33)
                                -> Filter
                                    -> Table "DISTANCE" as "R D" Access By ID
                                        -> Bitmap
                                            -> Index "UNK_DISTANCE_VALUE" List Scan (full match)

    DISTANCE       TIME_PASSED_SEC
============ =====================
        1600                99.800

Current memory = 2578027936
Delta memory = 624
Max memory = 2579873184
Elapsed time = 0.001 sec
Buffers = 153600
Reads = 0
Writes = 0
Fetches = 55
Per table statistics:
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+
 Table name                     | Natural | Index   | Insert  | Update  | Delete  | Backout | Purge   | Expunge |
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+
DISTANCE                        |         |        5|         |         |         |         |         |         |
TRIAL                           |         |        9|         |         |         |         |         |         |
TRIAL_LINE                      |         |        9|         |         |         |         |         |         |
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+

The query in which the regression manifests itself can be simplified to:

WITH
  R AS (
    SELECT
      D.AVALUE AS DISTANCE,
      MIN(TL.TIME_PASSED_SEC) AS TIME_PASSED_SEC
    FROM
      TRIAL_LINE TL
      JOIN TRIAL T
        ON TL.CODE_TRIAL = T.CODE_TRIAL
      JOIN DISTANCE D
        ON T.CODE_DISTANCE = D.CODE_DISTANCE
    WHERE TL.CODE_HORSE = 742363
    GROUP BY D.AVALUE
  )
SELECT * FROM R
ORDER BY R.DISTANCE
FETCH FIRST ROW ONLY

It seems to me that in this case the FIRST ROW strategy is wrong. We get the first record from the result of a query with GROUP BY, which must be executed completely and as quickly as possible, and only after that the first record is retrieved.

I can send you the database for reproducing the regression by email.

dyemanov commented 10 months ago

I suppose FIRST ROWS can be applied here for the outer ORDER BY (using sort keys + refetch), but it should not be propagated inside the GROUP BY, because GROUP BY and ORDER BY are done on different fields.

Please send me the database.

dyemanov commented 10 months ago

Ah, no, priorly aggregated rows cannot be refetched, so my above assumption is not valid.

sim1984 commented 10 months ago

The last ORDER BY has little effect. The query can be further simplified

SELECT
  D.AVALUE AS DISTANCE,
  MIN(TL.TIME_PASSED_SEC) AS ATIME
FROM
  TRIAL_LINE TL
  JOIN TRIAL T
    ON TL.CODE_TRIAL = T.CODE_TRIAL
  JOIN DISTANCE D
    ON T.CODE_DISTANCE = D.CODE_DISTANCE
WHERE TL.CODE_HORSE = 742363
GROUP BY D.AVALUE
FETCH FIRST ROW ONLY
EPluribusUnum commented 10 months ago

@sim1984 , SQL standard does not guarantee/require that GROUP by does ordering. ORDER BY must be present for guaranteed ordering. It's an implementation deatil that FB group by does implicit ordering, but in the future might be changed. (e.g: FB start supporting hash group by)

sim1984 commented 10 months ago

@EPluribusUnum I know that. In this case, the query has been simplified for easier debugging. To reproduce the regression, it does not matter whether the records are actually sorted and the first result will be returned. It is important that FETCH FIRST ROW ONLY implicitly switches the optimizer strategy, and in this case, a query executed with the FIRST ROWS strategy begins to slow down.

The initial SQL query where this was noticed is described above. All that is changed in it is the explicit substitution of literals instead of parameters, since it is executed inside a stored procedure.

sim1984 commented 10 months ago

In the last snapshot (buildno >= 1246) this SQL query works quickly :).

dyemanov commented 10 months ago

Yes, I know. It was buggy cost estimation for FIRST ROWS. However, it still looks like a good idea to avoid propagating FIRST ROWS into your CTE, so I'm not closing this ticket yet.