This is a library of in-memory indexes. They are used in the TinSpin TinSpin project. The library includes:
TinSpin indexes are also available via maven:
<dependency>
<groupId>org.tinspin</groupId>
<artifactId>tinspin-indexes</artifactId>
<version>2.1.4</version>
</dependency>
The indexes fall into four categories, depending on whether they use points or (axis-aligned) boxes as keys and whether they are maps or multimaps. This results in four base types with respective Java interfaces:
PointMap
is supported by CoverTree
, KDTree
, PHTreeP
, PointArray
, QuadtreeKD0
, QuadtreeKD
, QuadtreeKD2
and RTree
(via PointMapWrapper
)PointMultimap
is supported by KDTree
, PHTreeMMP
, PointArray
, QuadtreeKD0
, QuadtreeKD
, QuadtreeKD2
and RTree
(via PointMultimapWrapper
)BoxMap
is supported by PHTreeR
, BoxArray
, QuadtreeKD0
, QuadtreeKD
and RTree
BoxMultimap
is supported by BoxArray
, QuadtreeKD0
, QuadtreeKD
and RTree
Indexes can be created via factories in the interfaces, e.g. PointMap.Factory.createKdTree(...)
.
WARNING The Map
implementations are mostly not strict with respect to unique keys. That means they work fine if keys are unique. However, they may not enforce uniqueness (replace entries when the same key is added twice) and instead always add another entry. That means they may effectively act as multimaps. At the moment, only PH-Tree based indexes enforce uniqueness and properly overwrite existing keys.
Note:
....Factory.createAndLoadStrRTree(...)
.PointArray
and BoxArray
are simple array based implementations. They scale badly with size, their only use is for verifying correctness of other indexes. See CHANGELOG for details.
remove()
not cleaning up properly and kNN returning deleted entries.create()
method static
for IndexConfig
.create()
method for IndexConfig
.Some hints to improve performance:
reset()
method of iterators to reuse them instead of creating (complex) iterator objects for each query. This should reduce garbage collection load. IndexConfig
. "Defensive copying" creates a copy of all double[]
when inserted into the tree. Avoiding this copy may slightly improve performance and garbage collection but makes the tree more
vulnerable to inconsistencies when modifying the key externally. Other indexes may also become inconsistent,
but it is more severe for kD-tree because they use keys as positions for nodes. A Critical Bit tree for k-dimensional or arbitrary length keys. (Also called: binary patricia trie, radix-tree, ...)
Current version:
This is a Java implementation of a crit-bit tree. A crit-bit tree is a Patricia-Trie for binary data. Patricia-Tries achieve space efficiency by using prefix sharing. They are also update efficient because they are 'stable' trees, meaning that any update will affect at most two nodes.
Unlike other crit-bit trees, this tree also supports multi-dimensional data by interleaving the bits of each dimension and allowing k-dimensional queries on the tree.
Alternatively it supports 1-dimensional keys with arbitrary length (>64 bit).