PAIR-code / what-if-tool

Source code/webpage/demos for the What-If Tool
https://pair-code.github.io/what-if-tool
Apache License 2.0

Error with sklearn model in What-if tool performance tab #150

Closed: blah-crusader closed this issue 3 years ago

blah-crusader commented 3 years ago

Hi everyone! Nice tool !

I am able to get the datapoint editor to work, but the performance tab gives me an error. I believe my prediction function is adapted accordingly, but I get the following error:

"TypeError("unhashable type: 'list'")"

My custom prediction function takes raw input data, transforms it to numeric features, and then outputs a numpy array of predictions for each class:

def adjust_prediction(z):
    testing_data = pd.get_dummies(z, columns=categorical, drop_first=True)
    pred = lr.predict_proba(testing_data)
    return pred

My widgetconfig looks like this:

config_builder = (WitConfigBuilder(test_examples.tolist(), X_test.columns.tolist() + ["target"])
    .set_custom_predict_fn(adjust_prediction)
    .set_target_feature('target')
    .set_model_type('classification')
    .set_label_vocab(["Positive", "Negative"]))
WitWidget(config_builder, height=tool_height_in_px)

Does anyone have an idea how to fix this?

jameswex commented 3 years ago

If you print out the "pred" value in your custom predict function before it returns, what does it look like for a small set of like 3 test examples? Is it a simple 2D list of class scores, with dimensions of (num_examples, 2) like [[.4, .6], [.2, .8], [1, 0]]?
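A minimal sketch of that check, in case it helps. `to_class_scores` is a hypothetical helper, not part of witwidget; it only assumes the model output is either a numpy-style array (anything with `.tolist()`) or a nested Python sequence:

```python
def to_class_scores(pred, num_classes=2):
    """Return pred as a plain list of per-class score lists, checking its shape.

    Hypothetical helper for debugging a custom predict function.
    """
    # numpy arrays expose .tolist(); plain Python sequences are copied as-is.
    rows = pred.tolist() if hasattr(pred, "tolist") else list(pred)
    checked = []
    for i, row in enumerate(rows):
        # Each example must carry exactly one score per class.
        if not isinstance(row, (list, tuple)) or len(row) != num_classes:
            raise ValueError(
                f"row {i} should hold {num_classes} class scores, got {row!r}"
            )
        checked.append([float(score) for score in row])
    return checked


# Three examples, two classes each: dimensions (num_examples, 2).
print(to_class_scores([[0.4, 0.6], [0.2, 0.8], [1, 0]]))
```

Calling it on the value of `pred` just before the return shows whether the output is a flat list of labels (which raises `ValueError`) or the expected 2D list of scores.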

blah-crusader commented 3 years ago

Hi James, thanks for looking into this. Here is a picture of what my pred function returns; it is a numpy array with dimensions (num_examples, 2). Later on I can try converting it to a plain list of lists, without the numpy array structure around it, and will let you know if that fixes the issue!

Thanks!

blah-crusader commented 3 years ago

No, returning pred.tolist() from the prediction function does not seem to resolve the problem, even though the data structure is now exactly as you described. Do you see any other areas that I could check?

Thanks!

jameswex commented 3 years ago

Thanks! Other questions: What version of witwidget do you have installed and what environment are you running WIT in (colab, jupyter version x, jupyterlab version x?)

blah-crusader commented 3 years ago

I am running it in a Jupyter notebook. For reference, the widget did not work in JupyterLab, but it did in Jupyter Notebook. As for versions, I'm not sure all of these are relevant, but pip list shows the following Jupyter-related packages:

jupyter 1.0.0
jupyter-client 6.1.7
jupyter-console 6.2.0
jupyter-core 4.6.3
jupyterlab 2.2.6
jupyterlab-pygments 0.1.2
jupyterlab-server 1.2.0

And the following version of witwidget: witwidget 1.7.0

jameswex commented 3 years ago

FYI: Latest version witwidget 1.8.0 should work in Jupyterlab 2.x (may need to do all the witwidget install instructions again, not just updating the pip package).

w.r.t your error, I'll be able to debug more tomorrow. Thanks for the info.

blah-crusader commented 3 years ago

Thanks a lot!

jameswex commented 3 years ago

Are you able to share a notebook that reproduces the problem, or can the data not be shared? I'm wondering if the issue has to do with the format of your datapoints.

blah-crusader commented 3 years ago

Yes, I am testing it on the open Adult dataset from UCI, from https://archive.ics.uci.edu/ml/datasets/adult . I've attached a notebook that uses the dataset where I'm experiencing the issue!

What_IF_Tool_Logistic_Regression.zip

jameswex commented 3 years ago

Thanks! That helped a lot.

The issue was that the adjust_prediction function assumed the input z was a dataframe, but in WIT, the custom prediction function gets raw lists as input (such as [[27, 'White'], [45, 'White']]).

So if you change your function to have the lines:

z_df = pd.DataFrame(z, columns=['age', 'race'])
testing_data = pd.get_dummies(z_df, columns=categorical, drop_first=True)

then it should work in WIT.

Let me know if that works for you.
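Put together, the fix could look roughly like the sketch below. It is only an illustration under some assumptions: `make_predict_fn`, `column_names`, `train_columns` and `target` are hypothetical names, the rows WIT passes in are assumed to hold values in the same order as `column_names`, and `model` is anything with a `predict_proba` method (such as `lr` above).

```python
import pandas as pd


def make_predict_fn(model, column_names, categorical, train_columns, target=None):
    """Build a custom predict function that accepts raw lists (a sketch only)."""

    def predict(examples):
        # WIT passes raw lists, so rebuild a DataFrame with named columns first.
        df = pd.DataFrame(examples, columns=column_names)
        # Assumption: if the rows include the target column, drop it.
        if target is not None and target in df.columns:
            df = df.drop(columns=[target])
        # Encode every category, then align to the columns the model was
        # trained on; missing dummy columns are filled with 0.
        encoded = pd.get_dummies(df, columns=categorical)
        encoded = encoded.reindex(columns=train_columns, fill_value=0)
        pred = model.predict_proba(encoded)
        # Return plain nested lists rather than a numpy array.
        return pred.tolist() if hasattr(pred, "tolist") else [list(r) for r in pred]

    return predict
```

One design choice here goes slightly beyond the two lines above: the sketch encodes without `drop_first` and reindexes to the training columns, because a small batch of examples may not contain every category, in which case `drop_first=True` could drop a different column than it did at training time. The result of something like `make_predict_fn(lr, X_test.columns.tolist() + ["target"], categorical, train_columns, target="target")` would then be passed to `set_custom_predict_fn`.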

blah-crusader commented 3 years ago

Yeees, it does :) Great, thanks for the support!