FirebirdSQL / firebird

Firebird server, client and tools
https://www.firebirdsql.org/
1.23k stars 213 forks source link

Issue: Different plans between firebird 2.5 and firebird 3.0 with the same query #7172

Open azapater opened 2 years ago

azapater commented 2 years ago

We are migrating from firebird 2.5.9 to 3.0.9 (at the moment we are testing the same scenario with the current 3.0.10 snapshot) and we've detected that our app is slower that before in some queries. After detailed investigation we have isolated one of the main reasons: in specific scenarios, fb3 uses different plans compared with fb2, being the one chosen by fb 2 initially more optimal.

In both cases we are using firebird 64 bits windows version. We have tested all server modes in fb3 (just in case that could affect the caching process) with same results.

To reproduce the problem, two databases are needed: one in fb2 and another one in fb3.

I attach a script that creates a table and inserts the required data. This data is exported from one of our databases to replicate exactly the same issue. We have tested random values inserting thousands of rows, but the plan works fine in that case, so we assume the issue is specifically with the data we use and how that affects the indeces.

createTestScript.zip

The script is fully compatible between 2.5 and 3.0. It creates a table with 3 fields, inserts around 150k rows and creates the same indeces we use.

I also attach both databases in case they need some study or it makes it easier to test. The zip file contains:

testDBs.zip

test25.fdb -> firebird 2.5.9 version test30.fdb -> firebird 3.0.10 version

The query to reproduce the problem is:

SELECT *
FROM test
WHERE field1 = 60001108 AND field2 IS NULL 

This query, in firebird 2.5 uses both indices and fetches just a few rows. The plan and result is as follows:

Plan
PLAN (TEST INDEX (TEST_F2_NDX, TEST_F1_NDX)) 

------ Performance info ------
Prepare time = 0ms
Execute time = 0ms
Avg fetch time = 0,00 ms
Current memory = 34.956.712
Max memory = 35.011.128
Memory buffers = 2.048
Reads from disk to cache = 12
Writes from cache to disk = 0
Fetches from cache = 18 <--- CORRECT

In firebird 3, it uses just the index related with FIELD2 and fetches some thousands of rows. The Plan and result is as follows:

Plan
PLAN (TEST INDEX (TEST_F2_NDX))

------ Performance info ------
Prepare time = 0ms
Execute time = 15ms
Avg fetch time = 15,00 ms
Current memory = 36.286.928
Max memory = 36.349.440
Memory buffers = 2.048
Reads from disk to cache = 0
Writes from cache to disk = 0
Fetches from cache = 43.787 <--- INCORRECT

After testing thoroughly, we have detected some interesting behaviours. If we create the indeces before the data is inserted, the plan works as expected and firebird 3 uses both as it used to in 2.5. Once we export and import the db using gbak, the behaviour changes to use just one index. We assume that happens because gbak creates the indeces after the data has been imported, hence the same behaviour as the script, that creates the indeces at the end of the process.

dyemanov commented 2 years ago

If indices are created before inserting rows, their statistics is zero and the optimizer considers the worst case. If statistics is up-to-date, then decisions are based on that statistics.

The problem here is that TEST_F2_NDX is more selective than TEST_F1_NDX, but many its values are actually NULLs, i.e. distribution is skewed. The optimizer treats "field2 IS NULL" similar to "field2 = value" and expects it to return nearly one row, while in fact almost half of the rows are returned. It cannot be easily improved inside the engine without storing more detailed statistics. Your easy workaround could be to disable usage of TEST_F2_NDX by hinting:

SELECT *
FROM test
WHERE field1 = 60001108 AND field2+0 IS NULL

which will you only index TEST_F1_NDX with only 13-14 page fetches (even better than v2.5 does).