Closed: karchern closed this issue 2 years ago
Hi @karchern, sorry for the late reply. I think there are several ways you could obtain the transformed samples without errors:
1) You can call the .transform(X_new) method of your estimator on some new data array X_new.
2) If you want the trained L matrix as such, you can access it directly: it's the components_ attribute of the metric learner (e.g. my_metric_learner.components_), available after training.
3) If you want to obtain the L matrix from the M matrix, you can also use our utils function components_from_metric. It's a helper we designed to handle the cases where np.linalg.cholesky raises errors (for instance because some eigenvalues are very close or equal to zero): when np.linalg.cholesky doesn't work, we compute the decomposition through another method, via the eigendecomposition:
https://github.com/scikit-learn-contrib/metric-learn/blob/7eef7c6f9f376ad6f482369519e04d9062adc31d/metric_learn/_util.py#L377
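The idea behind that fallback can be sketched in pure NumPy (this is an illustrative re-implementation, not the library's exact code; the function name, tolerance handling, and the M = L.T @ L convention are assumptions):

```python
import numpy as np

def components_from_metric_sketch(M):
    """Recover a matrix L with M = L.T @ L from a PSD metric M.

    Illustrative sketch of the approach described above, not
    metric-learn's actual components_from_metric implementation.
    """
    try:
        # Fast path: Cholesky gives M = C @ C.T with C lower triangular,
        # so L = C.T satisfies M = L.T @ L.
        return np.linalg.cholesky(M).T
    except np.linalg.LinAlgError:
        # Fallback when M is singular or has near-zero eigenvalues:
        # eigendecompose M = V diag(w) V.T, then L = diag(sqrt(w)) @ V.T.
        w, V = np.linalg.eigh(M)
        w = np.clip(w, 0, None)  # clamp tiny negative eigenvalues from round-off
        return (V * np.sqrt(w)).T

# A rank-deficient PSD matrix, where np.linalg.cholesky fails outright:
M = np.array([[1.0, 2.0],
              [2.0, 4.0]])  # rank 1
L = components_from_metric_sketch(M)
assert np.allclose(L.T @ L, M)
```

The try/except shape means well-conditioned metrics take the cheap Cholesky route, and only degenerate ones pay for the eigendecomposition.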
I hope this helps,
This surely was helpful, thanks William!
Summary
Currently, after fitting an LMNN model, one can obtain the M matrix via get_mahalanobis_matrix(). This allows a user to manually calculate pairwise transformed distances between samples (via equation (2) in the original paper). To get transformed "raw" data (i.e. transformed values in a samples-by-features matrix), one needs the L matrix. This L matrix can be calculated from M, but apparently it's not possible to do so without introducing errors (see here).
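The relationship the summary relies on is that, for a learned metric M = L.T @ L, the Mahalanobis distance of equation (2) equals the squared Euclidean distance between L-transformed samples. A quick pure-NumPy check (with a random L standing in for a fitted learner's components):

```python
import numpy as np

rng = np.random.default_rng(0)
L = rng.standard_normal((3, 3))  # stand-in for a learned transformation
M = L.T @ L                      # the corresponding Mahalanobis matrix

x1, x2 = rng.standard_normal(3), rng.standard_normal(3)
d = x1 - x2
dist_via_M = d @ M @ d            # equation (2): d^T M d
dist_via_L = np.sum((L @ d) ** 2) # ||L d||^2 in the transformed space
assert np.isclose(dist_via_M, dist_via_L)
```

So having L lets you reproduce both the transformed data and the learned distances outside the estimator.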
Use Cases
This would allow users to recapitulate the transformation of raw data outside the scikit-learn context, which is useful.
Message from the maintainers:
Want to see this feature happen? Give it a 👍. We prioritise the issues with the most 👍.