lmcinnes / umap

Uniform Manifold Approximation and Projection
BSD 3-Clause "New" or "Revised" License

Possible to backprop through UMAP encoding/sensitivity analysis? #555

Open NMRobert opened 3 years ago

NMRobert commented 3 years ago

Hi,

If I constructed a pipeline consisting first of a UMAP embedding R^N->R^M followed by some differentiable mapping R^M->R^1 (say, in the case of a simple binary classifier), is there a way to backprop 'through' the encoder so that one could perform something like integrated gradients for sensitivity analysis/feature attribution?

I'm very curious as to what people are doing for explainability when using UMAP. Of course it is always possible to do some sort of perturbation sensitivity analysis or kernelSHAP (etc), but this tends to not be super computationally efficient.

Thank you!
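For reference, the perturbation approach mentioned above can be sketched as a central finite difference over each input feature. This is only an illustration: `toy_pipeline` is a hypothetical stand-in for "UMAP transform followed by a classifier", not the UMAP API, and the sketch assumes the pipeline is deterministic and smooth enough for finite differences to be meaningful.

```python
from typing import Callable, List, Sequence


def perturbation_sensitivity(
    predict: Callable[[Sequence[float]], float],
    x: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Central finite-difference estimate of d predict / d x_i for each feature."""
    grads = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += eps
        down[i] -= eps
        # Two pipeline evaluations per feature: 2 * N calls in total.
        grads.append((predict(up) - predict(down)) / (2 * eps))
    return grads


def toy_pipeline(x: Sequence[float]) -> float:
    """Hypothetical stand-in for an embedding R^3 -> R^2 plus a score R^2 -> R^1."""
    z = [x[0] + 2.0 * x[1], x[2] ** 2]  # stand-in "embedding"
    return z[0] - 0.5 * z[1]            # stand-in "classifier" score


sens = perturbation_sensitivity(toy_pipeline, [1.0, 2.0, 3.0])
print(sens)
```

Each sample needs 2N calls to the pipeline for N input features, which is where the computational cost comes from when the embedding step is expensive.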

jc-healy commented 3 years ago

Hi there,

I'm always excited to see people trying to come up with better ways to handle explainability (especially in unsupervised cases). Have you had a look at the new Parametric UMAP? It uses a neural network to learn a mapping from the original space to a low-dimensional one using the UMAP objective function, which should give you a differentiable encoder to backprop through. It sounds like what you would need to get started with this project. Check out the Read the Docs page here: https://umap-learn.readthedocs.io/en/latest/parametric_umap.html and the paper on arXiv: https://arxiv.org/abs/2009.12981
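To make the idea concrete, here is a minimal sketch of integrated gradients through a differentiable encoder followed by a scalar head. It does not use the Parametric UMAP API: the weights `W` and `V`, the `tanh` encoder, and the hand-written `model_grad` are all hypothetical stand-ins. With a trained parametric encoder, the gradient would presumably come from the autodiff framework instead.

```python
import math
from typing import List, Sequence

# Hypothetical weights: a toy encoder R^3 -> R^2 and a linear head R^2 -> R^1.
W = [[0.5, -1.0, 0.25], [1.5, 0.3, -0.7]]
V = [1.0, -2.0]


def encode(x: Sequence[float]) -> List[float]:
    """Toy differentiable encoder: z_j = tanh(sum_i W[j][i] * x_i)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]


def model(x: Sequence[float]) -> float:
    """Encoder followed by the scalar head."""
    return sum(v * zj for v, zj in zip(V, encode(x)))


def model_grad(x: Sequence[float]) -> List[float]:
    """Analytic gradient of model(x) with respect to x (chain rule through tanh)."""
    z = encode(x)
    return [
        sum(V[j] * (1.0 - z[j] ** 2) * W[j][i] for j in range(len(W)))
        for i in range(len(x))
    ]


def integrated_gradients(
    x: Sequence[float], baseline: Sequence[float], steps: int = 200
) -> List[float]:
    """Midpoint Riemann-sum approximation of integrated gradients."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight line from the baseline to the input.
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        total = [t + g for t, g in zip(total, model_grad(point))]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


x = [1.0, 0.5, -2.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline)
print(attributions)
```

A useful sanity check is the completeness property: the attributions should sum to approximately `model(x) - model(baseline)`.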


NMRobert commented 3 years ago

@jc-healy Thank you, that looks really promising. I'll have a go at it and see if I can create some illustrative examples :) I'm not sure whether it would be more appropriate to close this GitHub issue, but please feel free to do so if it will help keep things tidy.