Closed by zkurtz 4 years ago
I definitely plan to add it in a future version, but it won’t happen soon (at least not this month).
If you really need it, the way to add it would be to modify the C++ function `increase_comb_counter`, which is called from `traverse_tree_sim` and `traverse_hplane_sim`. Basically, you would need to add a new `increase_comb_counter` that iterates only over the desired combinations instead of over all pairs. Those combinations would have to be set through the row indices in `ix_arr` being above or below some number (e.g. the X groups having the earlier numbers and the Y groups having the later ones), plus a new 'if' condition that returns from the `*_sim` functions if there are only observations of one group.
(It’s quite a lot of work though)
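To make the idea concrete, here is a minimal sketch in Python rather than the library's C++. It is not the actual implementation: the function name `increase_cross_counter`, its signature, the inclusive `st`/`end` bounds, the threshold `n_x`, and the list-of-lists `counter` are all assumptions chosen for illustration. It only shows the two pieces described above: counting X-versus-Y pairs by splitting the row indices at a threshold, and returning early when a node holds observations from a single group.

```python
def increase_cross_counter(ix_arr, st, end, n_x, counter, increment=1.0):
    """Hypothetical cross-group variant of a pair counter.

    Rows with index < n_x are assumed to belong to X, the rest to Y.
    counter is an n_x by n_y list of lists, updated in place.
    Returns False when the node holds rows from only one group.
    """
    node_rows = ix_arr[st:end + 1]  # assumption: `end` is inclusive
    x_rows = [i for i in node_rows if i < n_x]
    y_rows = [j - n_x for j in node_rows if j >= n_x]

    # The extra 'if' condition: nothing to count with a single group.
    if not x_rows or not y_rows:
        return False

    # Iterate only over X-Y combinations, not over all pairs.
    for i in x_rows:
        for j in y_rows:
            counter[i][j] += increment
    return True


n_x, n_y = 3, 2
counter = [[0.0] * n_y for _ in range(n_x)]
ix_arr = [0, 3, 1, 4, 2]  # rows 0-2 are X, rows 3-4 are Y
increase_cross_counter(ix_arr, 0, 2, n_x, counter)  # node with rows 0, 3, 1
increase_cross_counter(ix_arr, 3, 4, n_x, counter)  # node with rows 4, 2
```

With this layout the counter has n x m entries instead of one entry per pair among all n + m rows, which is the saving the request is after.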
This is now implemented in the master branch.
`predict_distance` currently appears to support only returning O(n^2) distances, which is not scalable. Could you add an option to pass in two data frames (X = n x k, Y = m x k) instead of one, such that the returned distances have dimensions n x m? Even a one-versus-all option would be very useful.