When processing larger datasets, `bed_closest` sometimes reports the wrong interval because choosing the correct search path through the interval tree is complicated. An approach that traverses the interval tree could be written, but I believe it would need to query multiple search paths through the tree to ensure correctness. I've instead rewritten `bed_closest` to use a binary search approach, which has similar performance and is less complicated to debug and maintain. This approach was inspired by the IRanges implementation, which is written mostly in R.
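To illustrate the general idea of a binary search for the closest interval, here is a minimal sketch in Python. It is not the code in this PR and not the IRanges implementation. The function name `closest_non_overlapping`, the `(start, end)` tuple layout, half-open coordinates, and the gap definition are all assumptions made for the example. It only looks for the nearest non-overlapping interval on each side of the query; overlapping intervals, strand, and chromosome grouping are left out.

```python
from bisect import bisect_left, bisect_right


def closest_non_overlapping(query, subjects):
    """Return (interval, gap) pairs for the nearest subject on either side.

    Hypothetical helper for illustration only. Intervals are (start, end)
    tuples with half-open coordinates, so the gap between two adjacent
    intervals is 0. Overlapping subjects are ignored in this sketch.
    """
    q_start, q_end = query

    # Two sorted views of the subjects: one by start, one by end.
    # A real implementation would build these once, not per query.
    by_start = sorted(subjects, key=lambda iv: iv[0])
    by_end = sorted(subjects, key=lambda iv: iv[1])
    starts = [iv[0] for iv in by_start]
    ends = [iv[1] for iv in by_end]

    candidates = []

    # Downstream: first subject whose start is at or after the query end.
    i = bisect_left(starts, q_end)
    if i < len(starts):
        candidates.append((by_start[i], starts[i] - q_end))

    # Upstream: last subject whose end is at or before the query start.
    j = bisect_right(ends, q_start) - 1
    if j >= 0:
        candidates.append((by_end[j], q_start - ends[j]))

    if not candidates:
        return []

    # Keep the candidate(s) with the smallest gap; both sides on a tie.
    best = min(gap for _, gap in candidates)
    return [(iv, gap) for iv, gap in candidates if gap == best]


subjects = [(10, 20), (40, 50), (100, 120)]
print(closest_non_overlapping((25, 30), subjects))
print(closest_non_overlapping((60, 90), subjects))
```

Each query needs only two binary searches over sorted arrays, so there is a single, easy-to-check answer on each side instead of several possible paths through a tree. The sketch returns at most one subject per side, so subjects that share the same start or end are not all reported.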
Created on 2023-04-07 with reprex v2.0.2