SeldonIO / alibi

Algorithms for explaining machine learning models
https://docs.seldon.io/projects/alibi/en/stable/

Which anchor explainer to use for different types of input features #741

Open erkinaltuntas opened 2 years ago

erkinaltuntas commented 2 years ago

Hello everyone,

I have an ML classification model that uses different types of input features: numerical features as well as text features (which are processed by doc2vec). I am unsure whether the anchor functionality works for this kind of problem. Should I use AnchorText or AnchorTabular?

jklaise commented 2 years ago

Hi, for now the anchor algorithms only support tabular or text data separately but not both. We're looking into multi-modal explanation methods to support such use cases in the future.

That said, if the input to your model is a concatenation of tabular features and text vectors, you could try AnchorTabular, but be warned that it does not scale well with the number of features (I assume the text embeddings are quite high-dimensional). Also, the output will likely not be interpretable, because an anchor on a text embedding vector does not necessarily correspond to any words in natural language: doc2vec and word2vec map variable-length phrases to fixed-length vectors, so the inverse mapping may not exist. For example, if the embedding dimension is 100 and your anchor says that the first 10 dimensions are important to keep fixed, how would you map this space of "first 10 dimensions fixed, remaining 90 dimensions free to vary" to a set of words?
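
To make the interpretability point concrete, here is a minimal sketch in plain Python. It is a toy under stated assumptions, not doc2vec and not alibi's actual sampler: `toy_word_vector`, `toy_embed` and `sample_under_anchor` are hypothetical helpers invented for this illustration, the embedding is just an average of pseudo-random word vectors, and the perturbation simply resamples the free dimensions uniformly. It shows two things: different phrases can land on the same fixed-length vector (so there is no unique inverse), and the vectors allowed by a "first 10 dimensions fixed" anchor are arbitrary points that no phrase is guaranteed to produce.

```python
import random

EMBED_DIM = 100  # embedding size from the example above
N_FIXED = 10     # dimensions the hypothetical anchor holds fixed


def toy_word_vector(word, dim=EMBED_DIM):
    """Deterministic pseudo-random vector per word (toy stand-in, NOT doc2vec)."""
    rng = random.Random(word)  # string seeds are deterministic in Python 3
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def toy_embed(text, dim=EMBED_DIM):
    """Map variable-length text to a fixed-length vector by averaging word vectors."""
    vectors = [toy_word_vector(w, dim) for w in text.lower().split()]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def sample_under_anchor(embedding, n_fixed=N_FIXED, n_samples=5, seed=0):
    """Keep the first n_fixed dimensions, resample the rest uniformly.

    Simplified assumption: this is not how AnchorTabular perturbs data,
    it only illustrates the region of space such an anchor describes.
    """
    rng = random.Random(seed)
    return [
        embedding[:n_fixed] + [rng.uniform(-1.0, 1.0) for _ in embedding[n_fixed:]]
        for _ in range(n_samples)
    ]


# Two different phrases, one embedding: averaging ignores word order,
# so the mapping cannot be inverted to a unique phrase.
a = toy_embed("dog bites man")
b = toy_embed("man bites dog")
same_vector = all(abs(x - y) < 1e-12 for x, y in zip(a, b))

# Vectors consistent with the anchor: they share the first 10 dimensions,
# but nothing ties the other 90 values to any real phrase.
samples = sample_under_anchor(a)
```

A real doc2vec model is not a plain average, but the same issue applies: the anchor describes a region of embedding space, and there is no general way to translate that region back into a set of words.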