dcjones / coitrees

A very fast interval tree data structure
MIT License

Comparing to Implicit Interval Tree with Interpolation Index (iitii) #3

Open lh3 opened 4 years ago

lh3 commented 4 years ago

@mlin implemented iitii, which is also inspired by cgranges. I just wonder how coitrees and iitii compare to each other.

mlin commented 4 years ago

iitii is sensitive ("adaptive", if I'm spinning it) to the data; it's easy to come up with scenarios that revert it to cgranges equivalence. Though those may be rare in practice, coitrees has a constant factor advantage in the worst case.

If the data are compliant then iitii potentially has an asymptotic (in dataset size) advantage in the average case, since it skips search iterations. But the predicate is a strong one, so it's not a clear win. I've spent a little time trying to come up with a way to combine the ideas in iitii and coitrees, but nothing to speak of yet. It would help if it were possible to write closed or more-closed formulae for the van Emde Boas indices.
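To make the "skips search iterations" point concrete, here is a minimal sketch of the general idea on a plain sorted array rather than on an implicit interval tree. It is not iitii's actual algorithm: the names `fit_line` and `guided_lower_bound` are hypothetical, and the single linear model is an assumption chosen for brevity. A model fitted to the sorted start positions guesses a rank for the query, and the search gallops outward from that guess. When the data are close to what the model expects, only a couple of probes are needed; when they are not, the cost degrades toward an ordinary binary search.

```python
import bisect


def fit_line(xs):
    """Least-squares fit of rank ~ a * x + b over the sorted positions xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_r = (n - 1) / 2
    var = sum((x - mean_x) ** 2 for x in xs)
    if var == 0:
        return 0.0, mean_r
    cov = sum((x - mean_x) * (r - mean_r) for r, x in enumerate(xs))
    a = cov / var
    return a, mean_r - a * mean_x


def guided_lower_bound(xs, q, model):
    """Return (index of the first element >= q, number of probes into xs)."""
    a, b = model
    n = len(xs)
    # Predicted rank, clamped to a valid array index.
    guess = min(max(int(round(a * q + b)), 0), n - 1)
    probes = 1
    step = 1
    if xs[guess] < q:
        # Answer lies to the right of the guess: gallop rightwards.
        lo, hi = guess + 1, n
        while True:
            idx = guess + step
            if idx >= n:
                break
            probes += 1
            if xs[idx] < q:
                lo = idx + 1
                step *= 2
            else:
                hi = idx
                break
    else:
        # Answer is at or to the left of the guess: gallop leftwards.
        lo, hi = 0, guess
        while True:
            idx = guess - step
            if idx < 0:
                break
            probes += 1
            if xs[idx] >= q:
                hi = idx
                step *= 2
            else:
                lo = idx + 1
                break
    # Finish with a binary search inside the bracket [lo, hi).
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if xs[mid] < q:
            lo = mid + 1
        else:
            hi = mid
    return lo, probes
```

Calling `guided_lower_bound(xs, q, fit_line(xs))` returns the same index as `bisect.bisect_left(xs, q)`. The probe count is the interesting part: it stays small on evenly spaced positions and grows on skewed ones, which is the sense in which the predicate on the data is a strong one.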

I don't know if I'd recommend iitii for practical usage anytime soon; it's more of an exploratory side-project for me. The indexing scheme in GenomicSQLite is not as fast, but has many other (overwhelming?) practical utilities.

mlin commented 4 years ago

Specifically on van Emde Boas indexing: given the overall tree geometry, can one compute, in constant time, the array index of the i'th node (from the left) on tree level L? iitii is just regressing the interval positions on that.
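For reference, the index can at least be computed without building the tree, though not in constant time. The sketch below assumes a complete binary tree and a split in which the top part takes floor(h/2) of the h levels; both are assumptions for illustration, the function name `veb_index` is hypothetical, and the layout coitrees actually uses may differ. It follows the recursive definition of the layout directly, so it takes O(log h) arithmetic steps for a tree of h levels.

```python
def veb_index(height, depth, i):
    """Array position of the i'th node (0-based, from the left) at `depth`
    (root is depth 0) in a complete binary tree with `height` levels that is
    stored in van Emde Boas order."""
    offset = 0
    while height > 1:
        top_h = height // 2        # assumed split: top part gets floor(h/2) levels
        bot_h = height - top_h
        if depth < top_h:
            # The node lies in the top part, which is stored first.
            height = top_h
        else:
            d = depth - top_h      # depth of the node inside its bottom subtree
            j = i >> d             # which bottom subtree it is in, from the left
            # Skip the top part and the j bottom subtrees stored before ours.
            offset += (1 << top_h) - 1 + j * ((1 << bot_h) - 1)
            i &= (1 << d) - 1      # position from the left inside that subtree
            depth = d
            height = bot_h
    return offset
```

Because the height is roughly halved on every iteration, the loop runs about log2(h) times. A constant-time closed form would have to collapse that loop, which is the open part of the question.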

lh3 commented 4 years ago

> Specifically on van Emde Boas indexing: given the overall tree geometry, can one compute, in constant time, the array index of the i'th node (from the left) on tree level L?

@mlin #2 has some related discussions.