jovo closed 8 months ago
- similarity - not a distance
Yeah, this can be reworded. We currently expose a similarity (or, I guess, a kernel computation) among samples, not distances. But we convert this to a distance by just doing:
dists = 1.0 - similarity_matrix_normalized
- NNMetaEstimator - what is a meta-estimator? i've never heard that term. afaict, it is just computing nearest-neighbors, using the forest-based distance? not estimating anything technically? maybe estimating geodesic distance/neighbors?
A meta-estimator is scikit-learn terminology for a class that takes another Estimator as an argument. E.g. this is a WIP for me to implement an API for estimating nearest-neighbors using any arbitrary tree/forest Estimator as the "base estimator".
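To make the meta-estimator idea concrete, here is a rough sketch of the pattern, not the actual WIP class. The class name `ForestNeighbors`, its parameters, and the choice of leaf co-occurrence as the similarity are all assumptions made for illustration. Only the scikit-learn calls (`clone`, `apply`, `NearestNeighbors` with `metric="precomputed"`) are real API.

```python
import numpy as np
from sklearn.base import BaseEstimator, clone
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import NearestNeighbors


class ForestNeighbors(BaseEstimator):
    """Hypothetical meta-estimator: nearest neighbors from a forest proximity."""

    def __init__(self, estimator, n_neighbors=3):
        self.estimator = estimator
        self.n_neighbors = n_neighbors

    def fit(self, X, y=None):
        # Fit a copy of the wrapped forest, leaving the passed-in object untouched.
        self.estimator_ = clone(self.estimator).fit(X, y)
        # leaves[i, t] is the index of the leaf that sample i reaches in tree t.
        leaves = self.estimator_.apply(X)
        # Assumed similarity: fraction of trees in which two samples share a leaf.
        similarity = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)
        self.distance_matrix_ = 1.0 - similarity
        self.neighbors_ = NearestNeighbors(
            n_neighbors=self.n_neighbors, metric="precomputed"
        ).fit(self.distance_matrix_)
        return self

    def kneighbors(self):
        # With no query argument, neighbors are returned for the training samples.
        return self.neighbors_.kneighbors()


# Toy data, only to exercise the sketch.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
y = (X[:, 0] > 0).astype(int)

nn = ForestNeighbors(RandomForestClassifier(n_estimators=20, random_state=0))
nn.fit(X, y)
dist, ind = nn.kneighbors()
```

The wrapped forest is the "base estimator" here: any tree ensemble that implements `fit` and `apply` could be passed in instead of `RandomForestClassifier`.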
- why don't we expose actually computing the distance, the way cencheng says to?
Sure we can do that. Do you have a specific reference to what you are talking about?
I believe what is written is already the Cencheng method for distance-to-kernel transformations, i.e.
dists = 1.0 - similarity_matrix_normalized
In this case, max(K_ij) is 1 since similarities are in [0, 1]
All forest estimators have distance_matrix = 1 - compute_similarity_matrix(X)
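A small sketch of that transformation, assuming a similarity matrix K with entries in [0, 1]. The helper name `similarity_to_distance` is hypothetical and not part of the package. It uses max(K_ij) - K, which reduces to 1 - K when the largest similarity is 1.

```python
import numpy as np


def similarity_to_distance(similarity):
    """Convert a similarity (kernel) matrix to a dissimilarity matrix.

    Hypothetical helper: computes max(K) - K, which equals 1 - K when the
    similarities are normalized so that the largest entry is 1.
    """
    similarity = np.asarray(similarity, dtype=float)
    return similarity.max() - similarity


# Toy normalized similarity matrix: symmetric, with ones on the diagonal.
K = np.array([
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.3],
    [0.1, 0.3, 1.0],
])
dists = similarity_to_distance(K)
```

For an arbitrary similarity matrix the result is a dissimilarity and is not guaranteed to satisfy the triangle inequality, so calling it a "distance" is loose.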
"Distance Metrics" is redundant. Distances are Metrics.
also, the two functions under that are not distances or metrics.