FirebirdSQL / firebird

Firebird server, client and tools
https://firebirdsql.org
1.26k stars 217 forks source link

New index structures to eliminate data page reads [CORE5909] #6167

Open firebird-automations opened 6 years ago

firebird-automations commented 6 years ago

Submitted by: @livius2

Votes: 1

Will be good to see index only operations in situation like this:

table with index on SOURCE_ID and the query

SELECT GS.SOURCE_ID, COUNT(*) FROM GAIA_SOURCE GS GROUP BY GS.SOURCE_ID HAVING COUNT(*)>1

should go throught index only without visiting table data pages

it is important when working with big databases e.g. index size can be relatively small compared to table itself. In my real database index have only 18GB but table have 720GB. As you can see accessing only 18GB is not the problem - but accessing 720GB is.

same in other places join between tables (using indexed fields) without returning any other field values from some joined tables. I suppose this is big change and require REAL TRANSACTIONAL INDEXES. Current indexes benefit only for intensive updating application.

firebird-automations commented 6 years ago
Modified by: Sean Leyne (seanleyne) description: Will be good to see index only operations in situation like this: table with index on SOURCE\_ID and the query SELECT GS\.SOURCE\_ID, COUNT\(\*\) FROM GAIA\_SOURCE GS GROUP BY GS\.SOURCE\_ID HAVING COUNT\(\*\)\>1 should go throught index only without visiting table data pages it is important when working with big databases e\.g\. index size can be relatively small compared to table itself\. In my real database index have only 18GB but table have 720GB\. As you can see accessing only 18GB is not the problem \- but accessing 720GB is\. same in other places join between tables \(using indexed fields\) without returning any other field values from some joined tables\. I suppose this is big change and require REAL TRANSACTIONAL INDEXES\. Current indexes benefit only for intensive updating application\. =\> Will be good to see index only operations in situation like this: table with index on SOURCE\_ID and the query SELECT GS\.SOURCE\_ID, COUNT\(\*\) FROM GAIA\_SOURCE GS GROUP BY GS\.SOURCE\_ID HAVING COUNT\(\*\)\>1 should go throught index only without visiting table data pages it is important when working with big databases e\.g\. index size can be relatively small compared to table itself\. In my real database index have only 18GB but table have 720GB\. As you can see accessing only 18GB is not the problem \- but accessing 720GB is\. same in other places join between tables \(using indexed fields\) without returning any other field values from some joined tables\. I suppose this is big change and require REAL TRANSACTIONAL INDEXES\. Current indexes benefit only for intensive updating application\.
firebird-automations commented 6 years ago

Commented by: Sean Leyne (seanleyne)

Karol,

I understand the goal you are trying to describe but the Summary of the issue is ... useless. far too vague to provide any meaning to third-parties.

Please suggest an alternate description (eg. New index structures to eliminate data page reads).

firebird-automations commented 6 years ago

Commented by: @livius2

@Sean,

feel free to change title and description as my English is not so good

firebird-automations commented 6 years ago
Modified by: Sean Leyne (seanleyne) summary: Index only operations =\> New index structures to eliminate data page reads
firebird-automations commented 6 years ago

Commented by: Ann Harrison (awharrison)

Yes this is a big job and will significantly increase index size and degrade performance in many cases. Each index entry will need to identify the transaction number of the transaction that created the entry and the transaction that invalidated it. There are optimizations like dropping the identy of the creatimg index when an entry is fully mature and visible to all transactions but they cause even more index updates.

On the other hand, this is not an "either or" choice and could be implemented as a new index type, dropping the optimization choice in the designers lap.

firebird-automations commented 6 years ago

Commented by: @pavel-zotov

> will significantly increase index size and degrade performance in many cases Anyone did benchmark to compare performance ? (may be there is/was special build with such feature like Karol asks?)

> need to identify the transaction number When database is read-only (e.g. replication target or other DB for reports/OLAP) then we can return key from index immediately, without knowing its transaction creator. Isn't ?

firebird-automations commented 6 years ago

Commented by: @pavel-zotov

PS. When user does any kind of updates with GTT that was created 'on commit delete' then it is also no need to extract info from data pages if it is found in index key.

firebird-automations commented 5 years ago

Commented by: @livius2

if you consider implementing new index structure take also into account: 1. Bi directorial navigation 2. Different navigations in compound index first field asc second desc (maybe not alowed in bi directorial indexes) 3. partial navigation e.g. compound index (Field1, Field2, Field3) and in the where clause you specify ony (Field1 and Field3) or only FIeld3 and it should use also index (this work in PostgresSQL)