FirebirdSQL / firebird

Firebird server, client and tools
https://www.firebirdsql.org/
1.19k stars 205 forks source link

Bad optimization of queries with DB_KEY [CORE1295] #1716

Closed firebird-automations closed 14 years ago

firebird-automations commented 17 years ago

Submitted by: Alexander Tyutik (tut)

Is duplicated by CORE3997 Is related to QA509 Relate to CORE4530

To reproduce the problem you need any table from your DB and next procedure:

CREATE PROCEDURE GET_DB_KEY RETURNS ( "DB_KEY" CHAR(8) CHARACTER SET OCTETS) AS BEGIN SELECT RDB$DB_KEY FROM YOUR_TABLE WHERE ID = 3 INTO :DB_KEY; SUSPEND; END

Now good query with my test table called "Tree" (2 indexed reads):

----------------------------------------------------------------------------------------------------------------------

SELECT * FROM GET_DB_KEY G LEFT JOIN "Tree" T ON T.RDB$DB_KEY = G."DB_KEY"

PLAN JOIN (Tree INDEX (PK_Tree), T INDEX ())

----------------------------------------------------------------------------------------------------------------------

Bad queries:

----------------------------------------------------------------------------------------------------------------------

SELECT * FROM GET_DB_KEY G JOIN "Tree" T ON T.RDB$DB_KEY = G."DB_KEY"

PLAN JOIN (Tree INDEX (PK_Tree), T NATURAL)

----------------------------------------------------------------------------------------------------------------------

SELECT * FROM "Tree" WHERE RDB$DB_KEY = (SELECT "DB_KEY" FROM GET_DB_KEY)

PLAN (Tree INDEX (PK_Tree)) PLAN (Tree NATURAL)

----------------------------------------------------------------------------------------------------------------------

In both queries all records from table "Tree" were fetched

Commits: FirebirdSQL/firebird@134d923fa141529ddb1d5be64dec1be1cbd239c9 FirebirdSQL/firebird@57f2da480b809db22070f7791e1d18c03659679e FirebirdSQL/firebird@bebc026b62f5dac7d18f9d2a0563efd1af1b4ae0 FirebirdSQL/firebird@a1d9ca672642362374c3d52c6106619e0ddd06d7 FirebirdSQL/firebird@b07ef45ae6a4d4816ae36cfa2e638fc3a9c19af5 FirebirdSQL/firebird@c7853fa9e19c4c7ff8d3c6d353dfd1f0a21f9364 FirebirdSQL/fbt-repository@a8ad99b9ad8901e092a9cfda4b1f3403188893ae

firebird-automations commented 17 years ago

Commented by: @pcisar

I can't comprehend what you're trying to do here, as you're mixing apples and oranges in a way that doesn't make sense.

1. Retrieval via RDB$DB_KEY can't use any index, as it's meaningless . DB_KEY is "direct pointer" to the row, so retrieval is mere calculation of the location (data page + slot) and off you go.

2. Join of two different tables via their DB_KEY is pointless as they can't match from obvious reason (similar to comparison of two memory pointers to different objects for equality).

firebird-automations commented 17 years ago
Modified by: @pcisar status: Open \[ 1 \] =\> Resolved \[ 5 \] resolution: Won't Fix \[ 2 \]
firebird-automations commented 17 years ago

Commented by: Alexander Tyutik (tut)

> I can't comprehend what you're trying to do here, as you're mixing apples and oranges in a way that doesn't make sense.

No. In my version of procedure get DB_KEY from table "Tree" too.

> Join of two different tables via their DB_KEY is pointless

I haven't join diffirent tables. Read above. Seems i explained problem not clear :-(

Here is my version of procedure:

CREATE PROCEDURE GET_DB_KEY RETURNS ( "DB_KEY" CHAR(8) CHARACTER SET OCTETS) AS BEGIN SELECT RDB$DB_KEY FROM "Tree" WHERE ID = 3 INTO :DB_KEY; SUSPEND; END

and i wanted to say that you should use any existing table from your database instead of YOUR_TABLE. And then i wrote that i used table "Tree". You may use any other table, but should recompile procedure vith valid table name and ID value (or what you use instead of id).

firebird-automations commented 17 years ago

Commented by: @pcisar

Ok, so you at least compare apples to apples, but that doesn't invalidate the first point, that retrieval via RDB$DB_KEY don't use indices by design, as DB_KEY is in fact a pointer. In fact, using index would be slower than direct calculation of the physical row location from DB_KEY.

firebird-automations commented 17 years ago
Modified by: @pcisar status: Resolved \[ 5 \] =\> Closed \[ 6 \]
firebird-automations commented 17 years ago

Commented by: Alexander Tyutik (tut)

Im my test sample there are 10 records in table "Tree".

Now compare next queries and plans:

=====================================================================

SELECT * FROM GET_DB_KEY G LEFT JOIN "Tree" T ON T.RDB$DB_KEY = G."DB_KEY";

PLAN JOIN (Tree INDEX (PK_Tree), T INDEX ())

2 index reads from "Tree" here

----------------------------------------------------------------------------------------------------------------------

SELECT * FROM GET_DB_KEY G JOIN "Tree" T ON T.RDB$DB_KEY = G."DB_KEY";

PLAN JOIN (Tree INDEX (PK_Tree), T NATURAL)

10 non index reads and 1 index read here

=====================================================================

EXECUTE BLOCK RETURNS ( ID INTEGER ) AS DECLARE DBKEY CHAR(8) CHARACTER SET OCTETS; BEGIN FOR SELECT "DB_KEY" FROM GET_DB_KEY INTO :DBKEY DO BEGIN SELECT ID FROM "Tree" T WHERE T.RDB$DB_KEY = :DBKEY INTO :ID; SUSPEND; END END

PLAN (T INDEX ()) PLAN (Tree INDEX (PK_Tree))

2 index reads from "Tree" here

----------------------------------------------------------------------------------------------------------------------

SELECT ID FROM "Tree" WHERE RDB$DB_KEY = (SELECT "DB_KEY" FROM GET_DB_KEY)

PLAN (Tree INDEX (PK_Tree)) PLAN (Tree NATURAL)

10 non index and 1 index read frem "Tree" here

=====================================================================

You still don't see the problem?

firebird-automations commented 17 years ago

Commented by: @hvlad

Pavel, access via DB_KEY always shown in execution statistics as indexed read, IIRC

firebird-automations commented 17 years ago

Commented by: @pcisar

Vlad, that makes sense :-) But as you can see, DB_KEY access method is reported inconsistently in human readable form of execution plan. It's either INDEX without index name or NATURAL (depends on the context). Don't know what would be an outcome of any attempt to specify explicit execution plan that involve access via DB_KEY in this situation. Maybe we can fill an improvement request to report DB_KEY access method via special keyword (DB_KEY for example). Anyway, it's a different matter than original improvement request.

firebird-automations commented 17 years ago

Commented by: Alexander Tyutik (tut)

Pavel, and what next? Bug marked as closed and no new requests appeared instead of it. And, in general, i didn't understand why don't you like original improvement request. It called "Bad optimization of queries with DB_KEY" and it is really so.

firebird-automations commented 17 years ago

Commented by: @pcisar

Alexander, I closed your improvement request because there is nothing wrong with optimization of data retrieval via DB_KEY is it stands now. Current DB_KEY retrieval method is far much faster than retrieval with use of index (no index scan is needed at all, engine computes the exact row location from DB_KEY value). So from this point, your issue description and request is completely invalid, as there is simply no way how to make the DB_KEY retrieval faster or better.

If there is any issue with retrieval via DB_KEY, then it's in how this access method is reported in execution plan. But you must understand that it's rather a small glitch than real issue, as execution plan in "human readable form" is and always was an abbreviation of real plan used by engine. It shows only most important and common execution paths and methods in condensed form (some element like evals are stripped out).

So, if any improvement request should be made here, then against "RSB tree to text" conversion routine, not the optimizer itself or whatever. If you want to fill such improvement request, then feel free to do so, but I don't have the urge to do so myself.

firebird-automations commented 17 years ago

Commented by: Alexander Tyutik (tut)

> If there is any issue with retrieval via DB_KEY, then it's in how this access method is reported in execution plan.

Sorry, but it's not clear for me.

I asked server in my query to get me one record by one known db_key of that record. And server reported me that in some kind of queries it immediately fetched record by db_key, and in other queries it fetched all records of the table and after that it filtered each record comparing db_keys.

So, do you want to tell that server reported me incorrect amount of fetched records (it really fetched one record but shown 10 in statistics) or what?

It does not matter for me as for user what plan server generated. I don't like that it fetched all records in case when he should fetch only one. That's why for me fetching of all records of a table is a bad optimization.

firebird-automations commented 16 years ago

Commented by: @dyemanov

Pavel, I do see the optimization problem. Access via DBKEY is *always* reported as INDEX() -- without an index name -- but it's never reported as NATURAL. 10 non-indexed reads clearly show that the inner stream is scanned sequentially instead of involving a fast DBKEY lookup.

firebird-automations commented 16 years ago

Commented by: @dyemanov

This is a regression as compared with v1.5.x.

firebird-automations commented 16 years ago
Modified by: @dyemanov status: Closed \[ 6 \] =\> Reopened \[ 4 \] assignee: Dmitry Yemanov \[ dimitr \] resolution: Won't Fix \[ 2 \] =\>
firebird-automations commented 16 years ago
Modified by: @dyemanov Version: 2\.0\.1 \[ 10090 \] Version: 2\.0\.0 \[ 10091 \]
firebird-automations commented 16 years ago
Modified by: @pcisar Workflow: jira \[ 12260 \] =\> Firebird \[ 15596 \]
firebird-automations commented 14 years ago
Modified by: @dyemanov issuetype: Improvement \[ 4 \] =\> Bug \[ 1 \]
firebird-automations commented 14 years ago
Modified by: @dyemanov status: Reopened \[ 4 \] =\> In Progress \[ 3 \]
firebird-automations commented 14 years ago

Commented by: @dyemanov

Just to add a note: regression is only the last statement in the example (with a sub-query). And it looks fixed in v2.5. The second sample (inner join) works equally bad in the every FB version.

firebird-automations commented 14 years ago

Commented by: @dyemanov

One more example, slightly different this time:

select * from rdb$relations where rdb$db_key = ? and rdb$relation_id = 0

select * from rdb$relations where rdb$db_key = ? and rdb$relation_name = 'RDB$RELATIONS'

I'd expect to see the plan using DBKEY only, not an index (even primary) or a bitmap involving two predicates.

firebird-automations commented 14 years ago
Modified by: @dyemanov status: In Progress \[ 3 \] =\> Open \[ 1 \]
firebird-automations commented 14 years ago
Modified by: @dyemanov status: Open \[ 1 \] =\> Resolved \[ 5 \] resolution: Fixed \[ 1 \] Fix Version: 3\.0 Alpha 1 \[ 10331 \]
firebird-automations commented 11 years ago
Modified by: @dyemanov Link: This issue is duplicated by [CORE3997](https://github.com/FirebirdSQL/firebird/issues?q=CORE3997+in%3Atitle) \[ [CORE3997](https://github.com/FirebirdSQL/firebird/issues?q=CORE3997+in%3Atitle) \]
firebird-automations commented 11 years ago

Commented by: @dyemanov

Backported into v2.5.3.

firebird-automations commented 11 years ago
Modified by: @dyemanov Fix Version: 2\.5\.3 \[ 10461 \]
firebird-automations commented 10 years ago
Modified by: @pcisar Link: This issue is related to [QA509](https://github.com/FirebirdSQL/fbt-repository/issues?q=QA509+in%3Atitle) \[ [QA509](https://github.com/FirebirdSQL/fbt-repository/issues?q=QA509+in%3Atitle) \]
firebird-automations commented 9 years ago
Modified by: @dyemanov Link: This issue relate to [CORE4530](https://github.com/FirebirdSQL/firebird/issues?q=CORE4530+in%3Atitle) \[ [CORE4530](https://github.com/FirebirdSQL/firebird/issues?q=CORE4530+in%3Atitle) \]
firebird-automations commented 8 years ago

Commented by: @pcisar

Test created.

firebird-automations commented 8 years ago
Modified by: @pcisar status: Resolved \[ 5 \] =\> Closed \[ 6 \]
firebird-automations commented 8 years ago
Modified by: @pavel-zotov QA Status: No test
firebird-automations commented 8 years ago
Modified by: @pavel-zotov QA Status: No test =\> Done successfully