Open sim1984 opened 11 months ago
I suppose FIRST ROWS can be applied here for the outer ORDER BY (using sort keys + refetch), but it should not be propagated inside the GROUP BY, because GROUP BY and ORDER BY are done on different fields.
Please send me the database.
Ah, no, previously aggregated rows cannot be refetched, so my assumption above is not valid.
The last ORDER BY has little effect. The query can be further simplified to:
SELECT
    D.AVALUE AS DISTANCE,
    MIN(TL.TIME_PASSED_SEC) AS ATIME
FROM
    TRIAL_LINE TL
    JOIN TRIAL T
        ON TL.CODE_TRIAL = T.CODE_TRIAL
    JOIN DISTANCE D
        ON T.CODE_DISTANCE = D.CODE_DISTANCE
WHERE TL.CODE_HORSE = 742363
GROUP BY D.AVALUE
FETCH FIRST ROW ONLY
@sim1984, the SQL standard does not guarantee or require that GROUP BY orders its output. ORDER BY must be present for guaranteed ordering. It is an implementation detail that FB's GROUP BY sorts implicitly, and this might change in the future (e.g. if FB starts supporting hash-based GROUP BY).
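To make the ordering explicit instead of relying on the implicit sort, the simplified query can carry its own ORDER BY. This is only a sketch: the choice of D.AVALUE as the sort column is an assumption for illustration, since the thread does not say which row is expected first.

```sql
SELECT
    D.AVALUE AS DISTANCE,
    MIN(TL.TIME_PASSED_SEC) AS ATIME
FROM
    TRIAL_LINE TL
    JOIN TRIAL T
        ON TL.CODE_TRIAL = T.CODE_TRIAL
    JOIN DISTANCE D
        ON T.CODE_DISTANCE = D.CODE_DISTANCE
WHERE TL.CODE_HORSE = 742363
GROUP BY D.AVALUE
-- Explicit ordering (assumed sort column): the result no longer
-- depends on GROUP BY happening to sort its output.
ORDER BY D.AVALUE
FETCH FIRST ROW ONLY
```

With the ORDER BY in place, the row returned by FETCH FIRST ROW ONLY is well defined even if a future version groups without sorting.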
@EPluribusUnum
I know that. In this case, the query has been simplified for easier debugging. To reproduce the regression, it does not matter whether the records are actually sorted or which row is returned first. What matters is that FETCH FIRST ROW ONLY implicitly switches the optimizer strategy, and a query executed with the FIRST ROWS strategy becomes slower in this case.
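One way to check whether the implicit strategy switch is responsible would be to override it per statement. The sketch below assumes the OPTIMIZE FOR clause described in the Firebird 5.0 documentation, placed at the end of the SELECT statement. Nobody in this thread tested it against this database, so treat it as an experiment rather than a confirmed workaround.

```sql
SELECT
    D.AVALUE AS DISTANCE,
    MIN(TL.TIME_PASSED_SEC) AS ATIME
FROM
    TRIAL_LINE TL
    JOIN TRIAL T
        ON TL.CODE_TRIAL = T.CODE_TRIAL
    JOIN DISTANCE D
        ON T.CODE_DISTANCE = D.CODE_DISTANCE
WHERE TL.CODE_HORSE = 742363
GROUP BY D.AVALUE
FETCH FIRST ROW ONLY
-- Assumption: Firebird 5.0 syntax. Asks the optimizer to use the
-- ALL ROWS strategy even though only one row is fetched.
OPTIMIZE FOR ALL ROWS
```

If this variant runs as fast as it does on 4.0, that would suggest the slowdown comes from the FIRST ROWS plan choice rather than from the grouping itself.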
The initial SQL query where this was noticed is described above. All that is changed in it is the explicit substitution of literals instead of parameters, since it is executed inside a stored procedure.
In the last snapshot (buildno >= 1246) this SQL query works quickly :).
Yes, I know. The cause was a buggy cost estimation for FIRST ROWS. However, it still looks like a good idea to avoid propagating FIRST ROWS into your CTE, so I'm not closing this ticket yet.
It seems that in some cases, implicitly applying the FIRST ROWS optimization strategy degrades query performance.
The statistics for executing the following query in Firebird 4.0 and 5.0 are as follows:
Firebird 4.0.3
Firebird 5.0 RC1
This can be fixed by rewriting the query like this:
The query in which the regression manifests itself can be simplified to:
It seems to me that in this case the FIRST ROWS strategy is wrong. The first record is taken from the result of a query with GROUP BY, which must be executed completely (and as quickly as possible) before that first record can be retrieved.
I can send you the database for reproducing the regression by email.