Open c-blake opened 4 years ago
Well, I have an impl over in https://github.com/c-blake/adix of this. The usual B-Tree invariants are violated in the course of this initial phase and require a final touch up at the end. So, for example in Unix shell script-ese (I use Zsh , but it would probably work in Bash, too) you can do:
cd adix/tests
nim c ppss
nim c btshell
(for i in {1..100}; do echo A1 $i 0; done; echo D1 0; echo c; echo p)| ./ppss | ./btshell | less -R # -R for color structure highlighting
The A1 $i 0
s do the initial phase bulk add (with zero extra slots) while the D1 0
does the finalization/B-Tree invariant restoring part. You can also do A1 $i 1
and D1 1
to do only 18, not 19. (The default is a 10..20 link tree.) That could help avoid some splits if you think you might add a few elements here & there after initial construction but still get you 18/19 = 95% utilization. I suspect, but do not know for sure, that this algorithm can be generalized to allow any valid amount of spare space (well, less than half to satisfy the B-tree invariants anyway).
You'd need >~ 19*19=361 objects to start to see a third level which actually isnt' so bad if you pipe to a pager like less, but you could also just compile btshell
with -d:btTall
to get more structure sooner. Tall is often more useful for debugging than benchmarking/actual use.
B-Trees have efficient algorithm(s) to populate a tree for later modification from a set of entries already in-order (or in-reverse order). The algorithm is fast both in constant factors and asymptotically and creates a dense tree. https://en.wikipedia.org/wiki/B-tree has details.