marcotcr / lime

Lime: Explaining the predictions of any machine learning classifier
BSD 2-Clause "Simplified" License

Applying LIME to tree-based data #638

Open rupjit-bo opened 3 years ago

rupjit-bo commented 3 years ago

Awesome work on the library. I discovered it recently, and it seems to have a lot of good features. I wanted to try out LIME on tree-structured data (ASTs). I have trained a model called TBCNN, and now I want to check which nodes cause the prediction to be 1 or 0. I cannot use the method described here:

    def new_predict(x):
        data = strings_to_embeddings(x)
        return model.predict_proba(data)

In my case, x would be a list of nested dictionaries.

For example:

[{'node': 'ci_root', 'children': [{'node': 'ci_class_decl', 'children': [{'node': 'ci_modifiers', 'children': [{'node': 'ci_modifier', 'children': []}]}, {'node': 'ci_type', 'children': [{'node': 'SimpleType', 'children': [{'node': 'SimpleName', 'children': []}]}]}, ... ]

NOTE: This is just a part of one data point.

So to sum up, I have two questions:

1) Is it possible to use LIME on such data?
2) If yes, please provide some hints on how it can be done.

Thank you
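
One possible direction for question 2, offered as a rough sketch and not as the library's own method: LIME's text explainer perturbs an input by switching words on and off, and the same idea might carry over to trees by switching nodes on and off. The sketch below assumes each node is a dict with `'node'` and `'children'` keys as in the example above. The names `MASK_LABEL`, `index_nodes`, `apply_mask`, `sample_masks`, and `make_predict_fn` are hypothetical helpers, not part of LIME, and it assumes the model can accept a placeholder label for a masked node.

```python
import copy
import random

# Hypothetical placeholder label; the model's vocabulary would need to accept it.
MASK_LABEL = "<masked>"


def index_nodes(tree):
    """Return the tree's nodes in pre-order so each node gets a stable index."""
    nodes = []
    stack = [tree]
    while stack:
        node = stack.pop()
        nodes.append(node)
        # Push children reversed so they are popped left to right.
        stack.extend(reversed(node["children"]))
    return nodes


def apply_mask(tree, mask):
    """Copy the tree and relabel every node whose mask bit is 0."""
    perturbed = copy.deepcopy(tree)
    for node, keep in zip(index_nodes(perturbed), mask):
        if not keep:
            node["node"] = MASK_LABEL
    return perturbed


def sample_masks(num_nodes, num_samples, seed=0):
    """Draw random binary masks; the first row keeps every node."""
    rng = random.Random(seed)
    masks = [[1] * num_nodes]
    for _ in range(num_samples - 1):
        masks.append([rng.randint(0, 1) for _ in range(num_nodes)])
    return masks


def make_predict_fn(tree, predict_proba):
    """Wrap a tree-level predict_proba so it accepts binary node masks."""
    def predict_from_masks(masks):
        return predict_proba([apply_mask(tree, m) for m in masks])
    return predict_from_masks


if __name__ == "__main__":
    # Small complete tree in the same shape as the example data point.
    tree = {"node": "ci_root", "children": [
        {"node": "ci_class_decl", "children": [
            {"node": "ci_modifiers", "children": []},
            {"node": "ci_type", "children": []},
        ]},
    ]}
    masks = sample_masks(len(index_nodes(tree)), num_samples=5)
    # Stand-in for the real model: returns a constant probability pair.
    dummy_predict = lambda trees: [[0.5, 0.5] for _ in trees]
    print(make_predict_fn(tree, dummy_predict)(masks))
```

The masks and the resulting probabilities could then be fitted with a distance-weighted linear model, so that the coefficient for each node index suggests how much that node pushes the prediction toward 1 or 0. Relabeling nodes instead of deleting them is a design choice here: it keeps the tree shape valid for the model, though deleting whole subtrees may be a better perturbation for some models.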