SELECT COUNT(*) performance improvement

shunwang / redisql

Automatically exported from code.google.com/p/redisql

0 stars 0 forks source link


"SELECT COUNT(*)"'s performance can be massively improved, it is just
a question of me spending ?1-2 days? coding it.

The current implementation iterates the B+Tree blindly, whereas
"SELECT COUNT(*)" can be done ignoring all the leaves except the ones
containing "min,max"

This requires a special B+tree iterator ... which is complicated and
really should not be rushed.

Currently each B+tree node holds 256 values, so using this iterator
and counting a range query of 20K would involve iterating through less
than 80 nodes where it now goes through all 20K :)

Original issue reported on code.google.com by jaksprats on 5 Nov 2010 at 9:07

There has been a huge improvement w/ the introduction of the XthIteratorFind() iterator, but one more optimisation where num-descendants is stored in each node will provide another BIG performance boost

shunwang / redisql

SELECT COUNT(*) performance improvement #19