NeuralNetworkVerification / VeriX

VeriX: Towards Verified Explainability of Deep Neural Networks
BSD 3-Clause "New" or "Revised" License

Regarding the paper experiment in Section 4.1 #2

Open HCWDavid opened 7 months ago

HCWDavid commented 7 months ago

Hi, I am curious how you generated the counterfactual examples in Section 4.1 on MNIST (specifically digit '1'); could you explain a little bit? When I ran it, I was not able to generate the counterfactuals, only many images with a single pixel highlighted.

minwu-cs commented 5 months ago

Hi @HCWDavid,

Thanks a lot for your interest in our work and for using our tool.

Counterfactuals can be produced by setting plot_counterfactual to True. For instance, if we use the mnist-10x2.onnx model and the x_test[16] example, then with plot_counterfactual=True we should be able to plot the original image, the saliency map, the explanation, and the counterfactuals.

# Assuming VeriX and x_test (the MNIST test set) are already in scope,
# as in the repo's example usage.
verix = VeriX(dataset="MNIST",
              image=x_test[16],
              model_path="models/mnist-10x2.onnx")
verix.traversal_order(traverse="heuristic")   # sensitivity-based pixel ranking
verix.get_explanation(epsilon=0.05,           # l_inf perturbation magnitude
                      plot_counterfactual=True)

I'm attaching the results for this specific case in a zip file for reference purposes: mnist-10x2-index-16-heuristic-linf0.05.zip. For this example, 159 counterfactuals are generated, one per pixel in the explanation, matching the explanation size in explanation-159.png. Each counterfactual is named after its pixel, e.g., counterfactual-at-pixel-12-predicted-as-7.png, which means that by perturbing the 12th pixel together with the irrelevant pixels, the model's prediction can be manipulated from 9 to 7. Across all 159 counterfactuals, 137 are predicted as 7, 13 as 4, and 9 as 8.
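
In case it helps, these counts can be tallied directly from the saved plots. The following is just a quick sketch rather than part of the VeriX tool itself; it only assumes the counterfactual-at-pixel-<i>-predicted-as-<c>.png naming scheme above, and output_dir is a placeholder for wherever your run saves its plots.

import re
from pathlib import Path

# Placeholder path; point this at the directory containing the saved plots.
output_dir = Path("mnist-10x2-index-16-heuristic-linf0.05")

pattern = re.compile(r"counterfactual-at-pixel-(\d+)-predicted-as-(\d+)\.png")
pixels_by_label = {}  # predicted class -> list of explanation pixel indices
for path in output_dir.glob("counterfactual-at-pixel-*.png"):
    match = pattern.fullmatch(path.name)
    if match:
        pixel, label = int(match.group(1)), int(match.group(2))
        pixels_by_label.setdefault(label, []).append(pixel)

# For this example, this prints {7: 137, 4: 13, 8: 9} (in some order).
print({label: len(pixels) for label, pixels in pixels_by_label.items()})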

Therefore, to plot the figures as in Figure 1(b) of the paper, all we need to do is collect those 137 pixel indices and highlight them in yellow, showing that the prediction may be manipulated into 7 if those 137 pixels are not fixed. Similarly, it may be manipulated into 4 if the corresponding 13 pixels are not fixed, or into 8 if the corresponding 9 pixels are not fixed.
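
For the highlighting itself, a minimal matplotlib sketch along these lines would do. Again, this is an illustration rather than the exact plotting code behind Figure 1(b); it assumes the pixels_by_label dictionary from the sketch above, that pixel indices are flattened positions in a 28x28 image, and that x_test is the MNIST test set as before.

import numpy as np
import matplotlib.pyplot as plt

# Flattened pixel indices whose counterfactuals were predicted as 7,
# taken from the pixels_by_label sketch above.
rows, cols = np.unravel_index(pixels_by_label[7], (28, 28))

image = x_test[16].reshape(28, 28).astype(float)
if image.max() > 1:          # normalise to [0, 1] if pixels are 0..255
    image /= 255.0

overlay = np.stack([image] * 3, axis=-1)  # grayscale -> RGB
overlay[rows, cols] = [1.0, 1.0, 0.0]     # paint those pixels yellow

plt.imshow(overlay)
plt.axis("off")
plt.savefig("manipulated-into-7.png", bbox_inches="tight")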

While we use this specific example to show how we perform the counterfactual analysis, you are more than welcome to train your own model and test it on other images. For an MNIST image, it is not uncommon for all the produced counterfactuals to correspond to a single prediction, which makes sense: a handwritten digit 1, for instance, can be manipulated into a 7 much more easily than into other digits.

Hope this helps. If you have any further questions, please don't hesitate to let us know.

Wish you every success in your study/research goals.