fzi-forschungszentrum-informatik / TSInterpret

An Open-Source Library for the interpretability of time series classifiers
BSD 3-Clause "New" or "Revised" License

Attributions returned by TSInterpret do not match the Captum output #56

Closed shrezaei closed 5 months ago

shrezaei commented 6 months ago

Describe the bug: The attribution returned by Saliency_PTY did not make any sense to me, so I checked with the original Captum package and found that the Captum output is different from the attribution returned by TSInterpret. Here is the output of TSInterpret: [TSInterpret attribution plot]. Here is the output of Captum: [Captum attribution plot].

The dataset is synthetic: just a portion of a sine wave (the distinguishable feature) added to random noise. So it is clear that the TSInterpret attribution is incorrect. In this example I used IG, but the same happens with the other methods I tried.
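For reference, a direct Captum check along these lines might look like the following minimal sketch (model, x_test, i, and label are assumed from the reporter's setup; the input shape convention is illustrative):

    import torch
    from captum.attr import IntegratedGradients

    # Single test instance as a (1, n_features, n_timesteps) float tensor;
    # adjust to whatever shape the model expects.
    item = torch.from_numpy(x_test[i:i + 1]).float()

    ig = IntegratedGradients(model)
    # Captum's default baseline is zeros; it is made explicit here.
    attr = ig.attribute(item, baselines=torch.zeros_like(item), target=int(label))
    attr = attr.detach().numpy()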

I hope it helps you fix the issue.

JHoelli commented 6 months ago

Hi @shrezaei,

would you mind sharing your parametrization for both Captum and TSInterpret? One possible reason is the different default baselining behavior.

Captum uses zeros as the default baseline for IG. We use random noise, as zeros contain information in time series and therefore do not provide masking. (More details here: https://github.com/fzi-forschungszentrum-informatik/TSInterpret/issues/33)

If you want TSInterpret to behave similarly to Captum, you can provide the baseline in the **kwargs of the explain call: int_mod.explain(item, labels=label, baseline_Single=np.zeros(item.shape), assignment=0)
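A minimal sketch of such a call (item and label are assumed to match what int_mod expects; note the follow-up comment below, the keyword has to be spelled lower-case as baseline_single):

    import numpy as np

    # Zero baseline to mimic Captum's default instead of TSInterpret's
    # random-noise default.
    exp = int_mod.explain(
        item,
        labels=label,
        baseline_single=np.zeros(item.shape),
        assignment=0,
    )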

I hope this helps to solve the issue.

shrezaei commented 6 months ago

Thanks for your quick response. For others reading this: note that "baseline_Single" should be lower case, otherwise the baseline is not taken into account. Use "baseline_single=". Unfortunately, it didn't solve the issue. I still get a meaningless attribution which is very different from Captum's. Here is how I call the function: exp = xia_method.explain(x_test[i,:,:], labels=label, TSR=True, attribution=0.0, baseline_single=np.zeros(x_test.shape[1:]), assignment=0) [resulting attribution plot]. By the way, because the dataset is synthetic, I know the ground truth: the dataset is just random noise plus a portion of a sine wave.

JHoelli commented 6 months ago

Hi @shrezaei,

my guess is that the problem is the normalization of the attributions to the range [0, 1] (following [1]). In such a case the heatmap can be misleading.

I added a parameter that turns off the normalization. You can install it from here: pip install https://github.com/fzi-forschungszentrum-informatik/TSInterpret/archive/refs/heads/LEFTIST.zip

Simply set normalize=False when instantiating the method: int_mod = TSR(model, train_x.shape[-2], train_x.shape[-1], method='IG', mode='time', normalize=False)

Hope the results are now identical.

[1] Ismail, Aya Abdelsalam, et al. "Benchmarking deep learning interpretability in time series predictions." Advances in neural information processing systems 33 (2020): 6441-6452.

shrezaei commented 6 months ago

That is a good feature to have, and I appreciate it. But I don't think that is the issue. In the Captum output, the attribution is centered around the sine wave, although the values are relatively small; if one normalizes it, it should still be centered around the sine wave. The TSInterpret output, however, highlights the majority of the signal but not the sine wave at all. Normalization doesn't change that: if an attribution is zero (assuming all attribution values are non-negative), it remains zero, unless there is a bug in the normalization function.

JHoelli commented 6 months ago

Hi @shrezaei,

I just realized that you set TSR=True, so Temporal Saliency Rescaling is turned on. If you want to obtain the same results as with Captum, use TSR=False, normalize=False, and the baseline setting from above.
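Putting the three settings together, a minimal sketch (parameter names are taken from the snippets in this thread; the import path and shapes are assumptions and may differ between versions):

    import numpy as np
    # Import path assumed; check the TSInterpret docs for your version.
    from TSInterpret.InterpretabilityModels.Saliency.TSR import TSR

    # Mirror Captum's defaults: no Temporal Saliency Rescaling, no [0, 1]
    # normalization, and an all-zeros baseline.
    int_mod = TSR(model, train_x.shape[-2], train_x.shape[-1],
                  method='IG', mode='time', normalize=False)

    item = x_test[i, :, :]
    exp = int_mod.explain(item, labels=label, TSR=False,
                          baseline_single=np.zeros(item.shape), assignment=0)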

For TSR = True, I could find a problem for univariate data:

For univariate data, the calculation of the time-step importance leads to many time steps being attributed 0.1, because this is the fallback whenever the time contribution is not larger than the mean of all contributions. Where the time contribution at time step t is larger than the mean, multiplying feature contribution by time contribution effectively squares the attribution (for attributions < 1, the attribution gets smaller and smaller). In such an unfortunate setting, the 0.1 default can end up higher than the original attribution. With only one time series feature, the feature contribution is supposed to be one (the only feature is always the most important feature).
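An illustrative sketch of the described interaction for the univariate case (not the library's actual code; the 0.1 fallback and the feature-times-time multiplication are modeled as described above):

    import numpy as np

    def tsr_like_rescaling(attr, fallback=0.1):
        # attr: attributions of shape (1, n_timesteps) for a univariate series.
        time_contribution = np.abs(attr).sum(axis=0)      # one value per time step
        mean_contribution = time_contribution.mean()
        # Time steps whose contribution is not larger than the mean get the
        # constant fallback ...
        time_weight = np.where(time_contribution > mean_contribution,
                               time_contribution, fallback)
        # ... and feature contribution x time contribution effectively squares
        # attributions < 1, so they shrink instead of keeping their original
        # value (with a single feature the weight should simply be one).
        return attr * time_weight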

It is fixed here:

pip install https://github.com/fzi-forschungszentrum-informatik/TSInterpret/archive/refs/heads/LEFTIST.zip

Thanks for pinpointing the issue.

JHoelli commented 5 months ago

Closed due to inactivity.