outbrain / outrank

A Python library for efficient feature ranking and selection on sparse data sets.
https://dl.acm.org/doi/10.1145/3604915.3610636
BSD 3-Clause "New" or "Revised" License

Around 5x faster MI #50

Closed SkBlaz closed 1 year ago

SkBlaz commented 1 year ago

Turns out np.roll is much more expensive than initially envisioned. By unrolling this (no pun intended) we can get the same result in around 5-6x less time. This results in substantial speedups for longer runs, pairwise comparisons, and more complex derivations such as 3mr. Also adding uint indices, as apparently there is no overflow checking in this case.
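To illustrate the general idea (this is not the code from this PR, just a minimal sketch under assumptions): np.roll materializes a shifted copy of its input on every call, whereas the same comparison can be done on two slices of the original array. The function names `matches_with_roll` and `matches_unrolled`, and the "count matching positions" task itself, are hypothetical and chosen only for demonstration.

```python
import numpy as np


def matches_with_roll(a, b, shift):
    """Baseline: np.roll allocates and fills a shifted copy of b."""
    return int(np.sum(a == np.roll(b, shift)))


def matches_unrolled(a, b, shift):
    """Same count via two slice comparisons, with no shifted copy of b."""
    n = a.shape[0]
    if n == 0:
        return 0
    k = shift % n  # normalize negative or oversized shifts
    if k == 0:
        return int(np.sum(a == b))
    # np.roll(b, k)[i] == b[(i - k) % n], so:
    #   positions 0..k-1  line up with the last k elements of b
    #   positions k..n-1  line up with the first n-k elements of b
    head = np.sum(a[:k] == b[n - k:])
    tail = np.sum(a[k:] == b[:n - k])
    return int(head + tail)
```

Both functions return the same count for any integer shift; whether the slice-based version is actually faster, and by how much, depends on array sizes and on whether the code is compiled (for example with Numba), so it is worth timing on representative data.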