Closed: karchern closed this issue 2 years ago
Hi @karchern, sorry for the late reply. I think there are several ways you could obtain the transformed samples without errors:
1) You can call the .transform(X_new) method of your estimator on some new data array X_new.
2) If you want the trained L matrix as such, you can access it directly: it's the components_ attribute of the metric learner (e.g. my_metric_learner.components_), available after training.
3) If you want to obtain the L matrix from the M matrix, you can also use our utils function components_from_metric. It's a helper we designed to handle the cases where np.linalg.cholesky raises errors (for instance because some eigenvalues are very close or equal to zero): when np.linalg.cholesky doesn't work, we compute the decomposition through another method, via the eigendecomposition:
https://github.com/scikit-learn-contrib/metric-learn/blob/7eef7c6f9f376ad6f482369519e04d9062adc31d/metric_learn/_util.py#L377
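The idea behind that fallback can be sketched in pure NumPy (this is an illustrative re-implementation, not the library's exact code; the function name, tolerance handling, and the M = L.T @ L convention are assumptions):

```python
import numpy as np

def components_from_metric_sketch(M):
    """Recover a matrix L with M = L.T @ L from a PSD metric M.

    Illustrative sketch of the approach described above, not
    metric-learn's actual components_from_metric implementation.
    """
    try:
        # Fast path: Cholesky gives M = C @ C.T with C lower triangular,
        # so L = C.T satisfies M = L.T @ L.
        return np.linalg.cholesky(M).T
    except np.linalg.LinAlgError:
        # Fallback when M is singular or has near-zero eigenvalues:
        # eigendecompose M = V diag(w) V.T, then L = diag(sqrt(w)) @ V.T.
        w, V = np.linalg.eigh(M)
        w = np.clip(w, 0, None)  # clamp tiny negative eigenvalues from round-off
        return (V * np.sqrt(w)).T

# A rank-deficient PSD matrix, where np.linalg.cholesky fails outright:
M = np.array([[1.0, 2.0],
              [2.0, 4.0]])  # rank 1
L = components_from_metric_sketch(M)
assert np.allclose(L.T @ L, M)
```

The try/except shape means well-conditioned metrics take the cheap Cholesky route, and only degenerate ones pay for the eigendecomposition.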
I hope this helps,
This surely was helpful, thanks William!
Summary
Currently, after fitting an LMNN model, one can obtain the M matrix via get_mahalanobis_matrix(). This allows a user to manually calculate pairwise transformed distances between samples (via equation (2) in the original paper). To get transformed "raw" data (i.e. transformed values in a samples-by-features matrix), one needs the L matrix. This L matrix can be calculated from M, but apparently it's not possible to do so without introducing errors (see here).
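The relationship the summary relies on is that, for a learned metric M = L.T @ L, the Mahalanobis distance of equation (2) equals the squared Euclidean distance between L-transformed samples. A quick pure-NumPy check (with a random L standing in for a fitted learner's components):

```python
import numpy as np

rng = np.random.default_rng(0)
L = rng.standard_normal((3, 3))  # stand-in for a learned transformation
M = L.T @ L                      # the corresponding Mahalanobis matrix

x1, x2 = rng.standard_normal(3), rng.standard_normal(3)
d = x1 - x2
dist_via_M = d @ M @ d            # equation (2): d^T M d
dist_via_L = np.sum((L @ d) ** 2) # ||L d||^2 in the transformed space
assert np.isclose(dist_via_M, dist_via_L)
```

So having L lets you reproduce both the transformed data and the learned distances outside the estimator.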
Use Cases
This would allow users to recapitulate the transformation of raw data outside the scikit-learn context, which is useful.
Message from the maintainers:
Want to see this feature happen? Give it a 👍. We prioritise the issues with the most 👍.