ModelOriented / DALEX

moDel Agnostic Language for Exploration and eXplanation
https://dalex.drwhy.ai
GNU General Public License v3.0

[fastai] 'DataFrame' object has no attribute 'to_frame' with fastai #523

Open ming-cui opened 2 years ago

ming-cui commented 2 years ago

I'm trying to wrap a fastai tabular learner with DALEX. I got a 'DataFrame' object has no attribute 'to_frame' error from dx.Explainer(learn, xs, y, label = "Deep NN"). Is there a problem with this line of code? Thanks!

hbaniecki commented 2 years ago

What is the verbose output from the Explainer?


ming-cui commented 2 years ago

Please see the screenshot. [Screenshot: Snipaste_2022-08-02_10-21-26]

hbaniecki commented 2 years ago

Thanks. You need to pass a predict_function to the Explainer: a function that takes (model, pandas.DataFrame) as input and returns a 1d numpy array.
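For reference, a minimal sketch of that contract. `MeanModel` is a hypothetical stand-in (not a fastai or dalex object) so the snippet runs on its own, and the commented `dx.Explainer` call assumes `dalex` is imported as `dx` and that `model`, `X`, and `y` already exist.

```python
import numpy as np
import pandas as pd


class MeanModel:
    """Hypothetical stand-in model: predicts the row-wise mean of the features."""

    def predict(self, data: pd.DataFrame) -> np.ndarray:
        return data.mean(axis=1).to_numpy()


def predict_function(model, data: pd.DataFrame) -> np.ndarray:
    """Take (model, DataFrame) and return a 1-d numpy array, one value per row."""
    predictions = model.predict(data)
    return np.asarray(predictions, dtype=float).reshape(-1)


X = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [3.0, 4.0, 5.0]})
preds = predict_function(MeanModel(), X)

# With dalex installed, the function would be passed like this:
# explainer = dx.Explainer(model, X, y, predict_function=predict_function, label="Deep NN")
```

The `reshape(-1)` step matters when a model returns an array of shape (n, 1): it flattens the result to shape (n,).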

An example for the xgboost package is available at https://dalex.drwhy.ai/python-dalex-xgboost.html.

The already implemented predict_functions (e.g. for h2o, pycaret, tensorflow) are available at https://github.com/ModelOriented/DALEX/blob/master/python/dalex/dalex/_explainer/yhat.py.

We could add native support for fastai to dalex -- feel free to initiate a PR with changes to the yhat.py file, and I would try to make it happen.

ming-cui commented 2 years ago

Thanks, Hubert. The fastai .predict method returns three objects. The source code is in the screenshot. [Screenshot: Snipaste_2022-08-02_21-11-34] Is it not compatible with DALEX? I'm still getting errors, and the messages are lengthy.

hbaniecki commented 2 years ago

Yes, as I said, you need to create a new function, which will use the predict method and return only one object -- a 1-dimensional numpy array with predictions for a class of interest. Then pass it to the predict_function parameter in dx.Explainer.

ming-cui commented 2 years ago

Hi Hubert, I think I got the basic idea, but I struggled for quite a while as I'm new to Python and fastai. I got the error shown in the screenshot. It seems the .predict function only takes a single row as input, so I guess I need to write a custom prediction function to pass to DALEX. I've linked the source code of .predict here, or you can view the code in the screenshot. Could you help me get the right prediction function working with DALEX? Thank you so much!

[Screenshot: Error prediction function]

[Link: .predict source code]

hbaniecki commented 2 years ago

Hi! Maybe you can try predict_function = lambda m, d: m.predict(d)[2]?

ming-cui commented 2 years ago

Thanks for the function. I got the error message 'DataFrame' object has no attribute 'to_frame', with details below. Would the .get_preds() method be useful (shown in the photo in my last reply)? .get_preds() returns a tuple (one item for predictions and the other for true labels), and I haven't figured out how to use it in an explainer. [Screenshot: Snipaste_2022-08-03_20-59-07]
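One possible way to use .get_preds() here, as a sketch only: build a DataLoader from the DataFrame with fastai's `dls.test_dl` and pass it to `get_preds(dl=...)`. This assumes a regression learner with a single output per row. For a classifier, the predictions would have one column per class, and a column for the class of interest would have to be selected instead of flattening. `fastai_batch_predict` is a hypothetical helper name, and `_FakeLearner` is a stand-in so the snippet runs without fastai.

```python
import numpy as np
import pandas as pd


def fastai_batch_predict(learner, data: pd.DataFrame) -> np.ndarray:
    """Predict for a whole DataFrame via get_preds and return a 1-d numpy array."""
    dl = learner.dls.test_dl(data)        # build a DataLoader from the new rows
    preds, _ = learner.get_preds(dl=dl)   # (predictions, targets); targets unused
    if hasattr(preds, "numpy"):           # torch tensors expose .numpy()
        preds = preds.numpy()
    return np.asarray(preds, dtype=float).reshape(-1)


class _FakeDls:
    """Stand-in for learner.dls; passes the DataFrame through unchanged."""

    def test_dl(self, data):
        return data


class _FakeLearner:
    """Stand-in learner returning predictions of shape (n, 1), like a regression head."""

    dls = _FakeDls()

    def get_preds(self, dl=None):
        return dl[["x"]].to_numpy() * 2.0, None


batch = fastai_batch_predict(_FakeLearner(), pd.DataFrame({"x": [1.0, 2.0]}))

# With a real learner this would be passed as:
# dx.Explainer(learn, xs, y, predict_function=fastai_batch_predict, label="Deep NN")
```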

ming-cui commented 2 years ago

Is it possible that the tensor format predictions returned by .predict() caused 'DataFrame' object has no attribute 'to_frame'?

hbaniecki commented 2 years ago

Yes, you need to first apply the predict_function outside of the explainer and see what it returns.
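A small sketch of such a check, run on a few rows before the function goes anywhere near the explainer. `check_predict_function` is a hypothetical helper, and the `good` and `bad` lambdas are stand-ins for a real predict_function.

```python
import numpy as np
import pandas as pd


def check_predict_function(predict_function, model, data: pd.DataFrame) -> np.ndarray:
    """Call predict_function on a few rows and verify it returns a 1-d numpy array."""
    sample = data.head(5)
    out = predict_function(model, sample)
    print(type(out), getattr(out, "shape", None))
    if not isinstance(out, np.ndarray):
        raise TypeError(f"expected numpy.ndarray, got {type(out).__name__}")
    if out.ndim != 1:
        raise ValueError(f"expected a 1-d array, got shape {out.shape}")
    if out.shape[0] != len(sample):
        raise ValueError(f"expected {len(sample)} predictions, got {out.shape[0]}")
    return out


# Hypothetical stand-ins for the fitted model and the feature frame.
data = pd.DataFrame({"x": [0.0, 1.0, 2.0]})
good = lambda m, d: d["x"].to_numpy() * 2.0   # 1-d array: passes
bad = lambda m, d: d[["x"]].to_numpy()        # 2-d array of shape (n, 1): fails

checked = check_predict_function(good, None, data)
```

If the function returns a tensor, a tuple, or a 2-d array, the check fails with a message naming the problem.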

ming-cui commented 2 years ago

It returns a tensor. For example, learn.predict(xs_nn.iloc[0])[2] will return tensor([1.7050]) as the predicted value based on the first row of the training dataframe xs_nn. What should I do with that? Many thanks!

hbaniecki commented 2 years ago

It should return a one-dimensional numpy array.
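A rough sketch of one way to get there, assuming (as observed above) that `.predict` accepts one row at a time and that its third returned object holds a single predicted value such as `tensor([1.7050])`. The error message is consistent with this: a pandas Series has a `to_frame` method, while a DataFrame does not. `rowwise_predict` is a hypothetical helper name, and `_FakeLearner` is a stand-in so the snippet runs without fastai.

```python
import numpy as np
import pandas as pd


def rowwise_predict(learner, data: pd.DataFrame) -> np.ndarray:
    """Call learner.predict once per row and collect the values in a 1-d numpy array."""
    values = []
    for i in range(len(data)):
        row = data.iloc[i]               # a pandas Series, as in learn.predict(xs_nn.iloc[0])
        pred = learner.predict(row)[2]   # third returned object, e.g. tensor([1.7050])
        values.append(float(pred[0]))    # convert the single element to a Python float
    return np.array(values, dtype=float)


class _FakeLearner:
    """Stand-in learner: returns three objects, the third a one-element array."""

    def predict(self, row):
        return None, None, np.array([row["x"] + 0.5])


xs = pd.DataFrame({"x": [1.0, 2.0, 3.0]})
rowwise = rowwise_predict(_FakeLearner(), xs)

# With a real learner this would be passed as:
# dx.Explainer(learn, xs_nn, y, predict_function=rowwise_predict, label="Deep NN")
```

Predicting row by row is slow on large frames, so a batched approach may be preferable once the basic wiring works.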