awsm-research / PyExplainer

PyExplainer: A Local Rule-Based Model-Agnostic Technique (Explainable AI)
MIT License

getting error while running "explain" function #19

Closed ShraddhaDevaiya closed 2 years ago

ShraddhaDevaiya commented 2 years ago

Hello, I am trying to use PyExplainer with text data from bug reports as input. I have vectorized the text, and the failure occurs when I run this step: created_rules = py_explainer.explain(X_explain=X_explain, y_explain=y_explain, search_function='crossoverinterpolation', random_state=0, reuse_local_model=True)

I am getting this error:

Traceback (most recent call last):
  File "sh_project.py", line 49, in <module>
    created_rules = py_explainer.explain(X_explain=X_explain,
  File "/mnt/c/Users/shrad/Documents/PyExplainer/pyexplainer/pyexplainer_pyexplainer.py", line 554, in explain
    synthetic_object = self.generate_instance_crossover_interpolation(
  File "/mnt/c/Users/shrad/Documents/PyExplainer/pyexplainer/pyexplainer_pyexplainer.py", line 848, in generate_instance_crossover_interpolation
    X_train_i = X_train_i.loc[:, self.indep]
  File "/home/shraddha/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 961, in __getitem__
    return self._getitem_tuple(key)
  File "/home/shraddha/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 1149, in _getitem_tuple
    return self._getitem_tuple_same_dim(tup)
  File "/home/shraddha/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 827, in _getitem_tuple_same_dim
    retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
  File "/home/shraddha/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 1191, in _getitem_axis
    return self._getitem_iterable(key, axis=axis)
  File "/home/shraddha/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 1327, in _get_listlike_indexer
    keyarr, indexer = ax._get_indexer_strict(key, axis_name)
  File "/home/shraddha/.local/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 5782, in _get_indexer_strict
    self._raise_if_missing(keyarr, indexer, axis_name)
  File "/home/shraddha/.local/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 5842, in _raise_if_missing
    raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [Index(['bookmarks option',\n '[FIX]Not correctly retrieving post data when saving a page or frame generated from a form POST',\n 'Show URI in status bar onmouseover of Back/Forward menu items',\n 'Localization problems with Bookmarks Sorted By menu & History Sorted By menu',\n 'UI to allow external handlers for internal types',\n 'Mozilla should support X11 session management',\n 'POST result page should not appear in global history or history autocomplete results',\n 'Non-responding Windows UNC share hangs bookmark menu',\n 'want infinite Back', 'Use favicons on webpage shortcuts in Windows'],\n dtype='object', name='Title')] are in the [columns]"

Can anyone please help? What does this error indicate?

MichaelFu1998-create commented 2 years ago

Hi @ShraddhaDevaiya, may I have details of both variables X_explain and y_explain? Thanks!

ShraddhaDevaiya commented 2 years ago

Yes, X_explain and y_explain look like this:

[two screenshots attached, presumably X_explain and y_explain; images not preserved]

MichaelFu1998-create commented 2 years ago

Hi @ShraddhaDevaiya, how did you construct the PyExplainer object? Did you set the "indep" (feature column names) and "dep" (label column name) variables? You may also follow "PART B" in this tutorial file: https://github.com/awsm-research/PyExplainer/blob/master/TUTORIAL.ipynb. Let me know if it helps.
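
For readers hitting the same traceback: the failing line is X_train_i.loc[:, self.indep], and the KeyError lists bug-report titles (an Index named 'Title') where column names are expected. One plausible reading, which is an assumption and not confirmed in this thread, is that "indep" was set to the values of the Title column instead of the names of the vectorized feature columns. The sketch below reproduces that kind of failure with pandas alone; the toy data and the helper select_features are hypothetical and are not part of PyExplainer.

```python
import pandas as pd

# Hypothetical toy feature matrix standing in for vectorized bug-report text.
X_train = pd.DataFrame(
    {"bookmarks": [1, 0, 1], "menu": [0, 1, 1], "history": [1, 1, 0]}
)

# Row-level text values (titles), NOT column names. Passing these as "indep"
# mirrors the Index shown in the KeyError above (assumption).
titles = pd.Index(
    ["bookmarks option", "want infinite Back", "Mozilla should support X11"],
    name="Title",
)


def select_features(frame, indep):
    """Same kind of selection as the failing line: frame.loc[:, indep]."""
    return frame.loc[:, indep]


# Selecting by titles fails because none of them are column labels.
try:
    select_features(X_train, titles)
    error_message = None
except KeyError as exc:
    error_message = str(exc)

# Selecting by the actual column names works.
selected = select_features(X_train, X_train.columns)
print(error_message)
print(list(selected.columns))
```

If this is the cause, setting "indep" to the column names of the training DataFrame (for example X_train.columns) should let the selection succeed.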

ShraddhaDevaiya commented 2 years ago

Hello @MichaelFu1998-create, I set the "indep" and "dep" variables. Regarding the load_sample_data function: it loads the activemq file, but in my case the data should come from my own dataset, right? Could you please explain which features you added to the activemq file? I can see it has different columns than the dataset.

MichaelFu1998-create commented 2 years ago

Yes @ShraddhaDevaiya, you should load your own DataFrame, consisting of feature columns and a label column, instead of using the load_sample_data() function. Your DataFrame should look like the output of step 1.1 in the TUTORIAL notebook, but it can have different feature columns with different names. You may then define your preferred index column as in step 1.2, which is optional. After that, you need to define your feature columns and label column as shown in step 1.3; you may comment out AutoSpearman if you don't need it. Note that in our sample data the label column is the last column, hence we used y = df.iloc[:, -1] to get all of the labels. In your case the label column does not have to be the last column, so make sure you change -1 to the index of your label column.
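
The preparation described above can be sketched roughly as follows. This is a minimal illustration with pandas only, not the tutorial's code: the toy titles, the label name "label_defect", and the use of Series.str.get_dummies as a stand-in for a real text vectorizer are all assumptions.

```python
import pandas as pd

# Hypothetical bug-report titles and labels (toy data, not from the issue).
titles = pd.Series(
    ["bookmarks option", "want infinite back", "bookmarks menu hangs"]
)
labels = [True, False, True]

# Stand-in for a text vectorizer: one binary column per token.
df = titles.str.get_dummies(sep=" ")

# Put the label column first to show it does not have to be the last column.
dep = "label_defect"  # hypothetical label column name
df.insert(0, dep, labels)

# Feature column NAMES (not text values) are what "indep" should hold.
indep = [col for col in df.columns if col != dep]

X = df[indep]
y = df[dep]

# Positional equivalent of the tutorial's df.iloc[:, -1], adjusted to the
# actual position of the label column.
label_position = df.columns.get_loc(dep)
y_by_position = df.iloc[:, label_position]

print(indep)
print(label_position)
```

Selecting the label by name (df[dep]) avoids having to track its position, while the iloc form matches the tutorial's approach once -1 is replaced with the right index.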

MichaelFu1998-create commented 2 years ago

Feel free to open another issue if you are still having trouble running our package.