PAIR-code / what-if-tool

Source code/webpage/demos for the What-If Tool
https://pair-code.github.io/what-if-tool
Apache License 2.0
895 stars · 167 forks

Can the attribution values be visualized in Tensorboard mode? #62

Closed tanmayg closed 4 years ago

tanmayg commented 4 years ago

I worked with the notebook mode and was successfully able to project attribution scores (computed with the Shapley algorithm) onto the WIT dashboard. Due to a larger data size, I then tried the visualization in TensorBoard mode. The instructions on the documentation page mention only two requirements: 1. an ML model in TF Serving format and 2. a TFRecord file of the example dataset. There isn't any mention of generating or uploading attribution values (generated by Integrated Gradients or SHAP) in TensorBoard mode. Please suggest whether it's possible to add attribution values in TensorBoard mode, or am I missing something?

jameswex commented 4 years ago

Thanks for reaching out, Tanmay, and glad you were able to get attributions working with WIT in notebook mode. Unfortunately, WIT in TensorBoard doesn't currently support attributions. This is because WIT in TensorBoard queries models through TF Serving's PredictionService API, which returns only prediction results from the served model, not other custom-calculated values such as SHAP values. We rely on the model being queried to provide the attribution results (just as you did with a custom prediction fn when using attributions in notebook mode).

When you mention wanting to use TensorBoard mode due to a bigger data size, what exactly do you mean? Are you able to load more data points when using WIT in TensorBoard vs. in a notebook, or were you just experimenting with the limits of each mode?
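For reference, the notebook-mode attribution hook mentioned above looks roughly like this. The model, weights, and feature names below are toy stand-ins, not anything from this thread; only the `{'predictions', 'attributions'}` return shape follows the convention a custom prediction fn can use to hand attributions to WIT.

```python
import math

# Toy stand-in model: a logistic regression whose per-feature
# contributions double as attribution scores. The weights and feature
# names are made up for illustration.
WEIGHTS = {'age': 0.03, 'income': 0.0001}
BIAS = -1.0

def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def custom_predict_with_attributions(examples):
    """examples: list of dicts mapping feature name -> numeric value."""
    predictions, attributions = [], []
    for ex in examples:
        contribs = {name: w * ex[name] for name, w in WEIGHTS.items()}
        p = _sigmoid(BIAS + sum(contribs.values()))
        predictions.append([1.0 - p, p])   # class probabilities
        attributions.append(contribs)      # per-feature attribution dict
    return {'predictions': predictions, 'attributions': attributions}
```

In notebook mode, a function like this is what gets passed via `set_custom_predict_fn` on the config builder; TensorBoard mode has no equivalent hook, since predictions come straight from TF Serving.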

tanmayg commented 4 years ago

Thanks a lot for replying and clarifying, James; I really appreciate your prompt response. By bigger data size, I meant a test dataset of 100,000 data points and ~2,000 features. In notebook mode, I experimented with a subset of 500 data points and 50 features and got the desired results. But when I launched WIT with the complete dataset, the kernel kept dying repeatedly, so I suspected a notebook memory issue and wanted to try the dataset in TensorBoard mode. Please suggest better ways to handle bigger datasets with WIT, if possible. Regards.
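One common workaround while probing these limits (a sketch, not something prescribed in this thread; the helper name is made up) is to downsample before handing examples to the widget, so the frontend only holds a notebook-sized slice:

```python
import random

def sample_examples(examples, n, seed=0):
    """Return a uniform random subset of at most n examples.

    Keeps the interactive tool responsive by capping how many
    datapoints are shipped to the browser frontend.
    """
    if len(examples) <= n:
        return list(examples)
    return random.Random(seed).sample(examples, n)
```

The fixed seed makes the subset reproducible across notebook restarts, which matters when comparing attribution runs on the same sample.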

jameswex commented 4 years ago

WIT does have limits on the number of datapoints it can handle, which vary based on the size of each datapoint. This is because WIT sends all datapoints to the frontend for interactive analysis.

I'm curious how many data points you can use if you keep the same number of features but decrease the number of datapoints, and vice versa, if you are able to share those numbers.

In general, for large-scale analysis I would suggest tools that do offline processing, such as TensorFlow Model Analysis (https://www.tensorflow.org/tfx/model_analysis/install). TFMA can be used in conjunction with WIT in Colab/Jupyter: you find interesting data slices with TFMA, then load samples from a slice into WIT for further analysis. See https://ai.googleblog.com/2019/12/fairness-indicators-scalable.html for details on using TFMA's Fairness Indicators together with WIT.
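The TFMA-then-WIT workflow described above (score everything offline, find a weak slice, load just that slice interactively) can be sketched in plain Python. This is an illustration of the idea, not TFMA's API; the feature names and the accuracy metric are assumptions.

```python
from collections import defaultdict

def worst_slice(examples, slice_key):
    """Group examples by one feature value and return the slice with the
    lowest accuracy -- a stand-in for the weak slices that TFMA's offline
    metrics would surface for closer inspection in WIT."""
    stats = defaultdict(lambda: [0, 0])  # slice value -> [correct, total]
    for ex in examples:
        s = stats[ex[slice_key]]
        s[0] += int(ex['prediction'] == ex['label'])
        s[1] += 1
    return min(stats, key=lambda v: stats[v][0] / stats[v][1])
```

Once the weakest slice is identified offline, only the examples belonging to it need to be loaded into WIT, keeping the interactive set small.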

tanmayg commented 4 years ago

Thanks again, James, for your inputs; they really helped me a lot. I'm closing the issue now, as I have a direction to go and experiment. I have another concern regarding 3D inputs in Google Explain, for which I will open a new issue.