colorlace opened this issue 11 months ago
Created a unit test that shows a mismatch between the top-K mask and the location of the perturbation in this fork.
As the comment in the code snippet below states, the features in the top-K are kept static. For PGI we want the opposite. We want the features in the top-K to be perturbed and the rest to remain static.
```python
# keeping features static that are in top-K based on feature mask
perturbed_samples = original_sample * feature_mask + perturbations * (~feature_mask)
```
(code snippet from `NormalPerturbation.get_perturbed_inputs` in `explainers/catalog/perturbation_methods.py`)
Thank you all for pointing out the issues!
@colorlace is correct that for PGI we want to perturb the top-K features. The code in `evaluator.py` is correct and supports this; which features are perturbed depends solely on what mask is passed in via the `input_dict` variable.
The convention is that the mask should contain 0s for the top-K features and 1s for non-top-K features (so that `perturbed_samples` matches `original_sample` where `feature_mask` is high).
We have updated the comment to say: "keeping features static where the feature mask is high".
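A minimal sketch of that convention (this is illustrative, not the actual OpenXAI implementation; the function name, `sigma`, and the Gaussian-noise choice are assumptions):

```python
import numpy as np

def perturb_sketch(original_sample, feature_mask, rng, sigma=0.1):
    # Hypothetical sketch of the masking convention, not the real
    # NormalPerturbation.get_perturbed_inputs: feature_mask is
    # 1 (True) for non-top-K features kept static and 0 (False)
    # for top-K features that receive Gaussian noise.
    perturbations = original_sample + sigma * rng.standard_normal(original_sample.shape)
    # keeping features static where the feature mask is high
    return original_sample * feature_mask + perturbations * (~feature_mask)

rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, 3.0, 4.0])
mask = np.array([False, True, True, False])  # features 0 and 3 are top-K
x_pert = perturb_sketch(x, mask, rng)
# features 1 and 2 stay static; features 0 and 3 are perturbed
```

With this convention, PGI measures the prediction change when only the top-K (mask-low) features are perturbed.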
> Created a unit test that shows a mismatch between the top-K mask and the location of the perturbation in this fork.
@tazitoo thanks for the unit test implementation, it's very helpful. The problem is in the `generate_mask` function.
We have updated the `generate_mask` function to set top-K features to 0 in the mask and features outside the top-K to 1. This way, in the code snippet @colorlace provided, `perturbed_sample` will be equal to `original_sample` for features outside the top-K.
We have also updated the function to consider absolute value when computing the top-K.
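A sketch of what the fixed `generate_mask` behavior amounts to (hypothetical reimplementation, not the repo's code; the argsort-based selection is an assumption):

```python
import numpy as np

def generate_mask(attributions, topk):
    # Sketch of the fixed convention: mask is 0 (False) for the
    # top-K features, 1 (True) elsewhere, with top-K chosen by
    # absolute attribution value so negative attributions count.
    order = np.argsort(-np.abs(attributions))
    mask = np.ones(len(attributions), dtype=bool)
    mask[order[:topk]] = False
    return mask

generate_mask(np.array([0.1, -0.9, 0.5, 0.0]), topk=2)
# top-2 by |attribution| are indices 1 and 2,
# so the mask is [True, False, False, True]
```

Note that the large negative attribution at index 1 is selected into the top-K only because of the absolute-value change.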
As for the unit test, it still won't pass currently, since we actually want:
```python
assert mask.sum() == len(x) - topk
```
The mask should be high for features outside the top-K, i.e., it should be True for each feature being 'masked'.
Once that change is made, the tests pass on my end, including when we have negative feature attributions. Thanks again for pointing out this issue!
Thanks for the reply. If there was a bug in the mask function, then were the results in the paper and on the leaderboard erroneous? ...it would have resulted in PGI and PGU being inverted...?