Are there any quantitative evaluation methods that can be used to evaluate the performance of the Anchors algorithm on Images?

Hey @krishnakripaj,

Anchors are themselves built to satisfy certain performance/accuracy metrics.

The precision of an anchor is the proportion of instances contained within the anchor that obtain the same classification as the instance you're explaining. In other words, if you sample from the anchor, it's the probability that the sampled instance gets the same classification as the original. Anchors are generated so as to reach a minimum precision, the threshold, which is passed as an argument to the AnchorImage explainer.
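To make the precision definition concrete, here's a minimal NumPy-only sketch of estimating it by sampling. It is not alibi's internal implementation: `estimate_precision`, `toy_predict`, the zero-fill perturbation and the `p_keep` parameter are all hypothetical names and choices made up for illustration, and the image is a tiny grayscale array split into four hand-made "super-pixels".

```python
import numpy as np

def estimate_precision(predict_fn, image, segments, anchor_ids,
                       n_samples=200, p_keep=0.5, seed=0):
    """Monte Carlo precision estimate (illustrative sketch, not alibi code).

    image:      (H, W) grayscale array
    segments:   (H, W) integer array of super-pixel labels
    anchor_ids: labels of the super-pixels that make up the anchor
    """
    rng = np.random.default_rng(seed)
    target = predict_fn(image[None])[0]           # class of the original image
    seg_ids = np.unique(segments)
    in_anchor = np.isin(seg_ids, list(anchor_ids))

    # Randomly keep or drop each super-pixel, but always keep the anchor.
    keep = rng.random((n_samples, seg_ids.size)) < p_keep
    keep[:, in_anchor] = True

    # Expand the per-super-pixel decisions to per-pixel masks.
    idx = np.searchsorted(seg_ids, segments)      # (H, W) index into seg_ids
    pixel_keep = keep[:, idx]                     # (n_samples, H, W)

    # Assumption: dropped super-pixels are simply zeroed out.
    samples = np.where(pixel_keep, image[None], 0.0)
    preds = predict_fn(samples)
    return float(np.mean(preds == target))

def toy_predict(batch):
    # Hypothetical classifier: class 1 if the top-left quadrant is bright.
    return (batch[:, :2, :2].mean(axis=(1, 2)) > 0.5).astype(int)

image = np.full((4, 4), 0.2)
image[:2, :2] = 1.0                               # bright top-left quadrant
segments = np.array([[0, 0, 1, 1],
                     [0, 0, 1, 1],
                     [2, 2, 3, 3],
                     [2, 2, 3, 3]])

precision_good = estimate_precision(toy_predict, image, segments, {0})
precision_bad = estimate_precision(toy_predict, image, segments, {3})
print(precision_good, precision_bad)
```

Here the anchor `{0}` pins the only super-pixel the toy classifier looks at, so every sample keeps the original class. The anchor `{3}` leaves that super-pixel free to be dropped, so the estimate falls to roughly `p_keep`.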

We also generate anchors so as to maximise their coverage. Coverage is the proportion of instances in the dataset that fall within the anchor. For images this isn’t well defined: image anchors are made up of super-pixels generated from the instance of interest, and other data points are very unlikely to contain those super-pixels, so it’s hard to say which other instances are in the anchor. Instead, we generate an artificial dataset from the image and compute coverage on that.
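As a rough illustration of coverage on an artificial dataset, the sketch below draws random on/off masks over the image's super-pixels and counts how many of them contain every super-pixel in the anchor. This is an assumption about how such a dataset could look, not a description of alibi's code; `estimate_coverage`, `p_keep` and the sample count are hypothetical.

```python
import numpy as np

def estimate_coverage(n_segments, anchor_ids, n_samples=10_000,
                      p_keep=0.5, seed=0):
    """Fraction of random super-pixel masks that contain the whole anchor.

    Illustrative sketch only: each row of `data` is one artificial
    "instance", i.e. a boolean vector saying which super-pixels are present.
    """
    rng = np.random.default_rng(seed)
    data = rng.random((n_samples, n_segments)) < p_keep
    anchor = np.asarray(list(anchor_ids), dtype=int)
    # An instance is covered if all anchor super-pixels are present in it.
    covered = data[:, anchor].all(axis=1)
    return float(covered.mean())

cov_empty = estimate_coverage(4, [])       # empty anchor applies everywhere
cov_one = estimate_coverage(4, [0])        # about p_keep
cov_two = estimate_coverage(4, [0, 1])     # about p_keep ** 2
print(cov_empty, cov_one, cov_two)
```

Under these assumptions, each super-pixel added to the anchor shrinks coverage by about a factor of `p_keep`, which is the trade-off against precision that the anchor search has to balance.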

Anchors are quite computationally expensive, especially with large numbers of features, which is why we use super-pixels (see interpretable-ml-book for a discussion of runtimes). Their runtime is also highly dependent on the data and the instance of interest. For instance, anchors explaining instances close to decision boundaries may take longer to compute. We're planning an experimental exploration of runtime to give users an idea of what to expect, but we haven't started on it yet.

I wonder if you could share more details about your use case?

SeldonIO / alibi #637