shunwang / redisql

Automatically exported from code.google.com/p/redisql
0 stars 0 forks source link

SELECT COUNT(*) performance improvement #19

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago

"SELECT COUNT(*)"'s performance can be massively improved, it is just
a question of me spending ?1-2 days? coding it.

The current implementation iterates the B+Tree blindly, whereas
"SELECT COUNT(*)" can be done ignoring all the leaves except the ones
containing "min,max"

This requires a special B+tree iterator ... which is complicated and
really should not be rushed.

Currently each B+tree node holds 256 values, so using this iterator
and counting a range query of 20K would involve iterating through less
than 80 nodes where it now goes through all 20K :) 

Original issue reported on code.google.com by jaksprats on 5 Nov 2010 at 9:07

GoogleCodeExporter commented 9 years ago
There has been a huge improvement w/ the introduction of the XthIteratorFind() 
iterator, but one more optimisation where num-descendants is stored in each 
node will provide another BIG performance boost

Original comment by jaksprats on 12 Jan 2011 at 10:36