BinaryAnalysisPlatform / bap

Binary Analysis Platform
MIT License
2.07k stars 273 forks source link

switches to patricia trees in the KB implementation #1415

Closed ivg closed 2 years ago

ivg commented 2 years ago

The patricia tree delivers faster inserts (since there's no need for rebalancing) and even lookups. The fast merges are not really used yet, but having patrica trees opens nice opportunties for parallelization in the future. The overall improvement is something about 15% and there's also some improvement in the memory footprint. This change also removes KB operations from the critical path, so further optimization of the KB representation deems unnecessary at this time.

We use patricia trees (from Fast Mergeable Integer Maps (1998), 1) for all finite mappings from objects. The values themselves are still represented with the custom AVL tree.

We use big-endian trees with specially encoded keys, so that the top 6 bits are used to encode the branching bit (its number) and the lower 57 bits are used to represent the key itself. This saves a word per each branch of the tree.