pytorch / captum

Model interpretability and understanding for PyTorch
https://captum.ai
BSD 3-Clause "New" or "Revised" License

Layer Feature Attribution for RNN models #1236

Open jingyan-li opened 7 months ago

jingyan-li commented 7 months ago

Hello,

I am having trouble running layer feature attribution on my GRU model.

from captum.attr import LayerIntegratedGradients

# Attribute with respect to the encoder's embedding layer
ig = LayerIntegratedGradients(forward_func, model.encoder.embedding)
target_idx = 0
attr = ig.attribute(x, target=target_idx, additional_forward_args=(0.0,))

My forward function forward_func wraps a GRU seq-to-seq model with an encoder and a decoder. The input has shape (Batch_Size, Timesteps, Features). Inference and regular (non-layer) feature attribution both work fine, but layer feature attribution over the embedding layer of the GRU encoder raises an error:
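For concreteness, here is a minimal sketch of the kind of setup being described. All module and variable names (Encoder, Seq2Seq, vocab_size, teacher_forcing, etc.) are assumptions for illustration, not the original code; the real model's decoder is omitted.

import torch
import torch.nn as nn
from captum.attr import LayerIntegratedGradients

class Encoder(nn.Module):
    def __init__(self, vocab_size=100, hidden=32):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden)   # layer to attribute
        self.gru = nn.GRU(hidden, hidden, batch_first=True)

    def forward(self, x):
        # x: (batch, timesteps) token indices -> (batch, timesteps, hidden)
        return self.gru(self.embedding(x))

class Seq2Seq(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = Encoder()

    def forward(self, x, teacher_forcing=0.0):
        out, h = self.encoder(x)
        # ... a decoder would consume h step by step here ...
        return out[:, -1, :]            # (batch, hidden): one score per target index

model = Seq2Seq()
x = torch.randint(0, 100, (4, 10))      # (batch, timesteps)
lig = LayerIntegratedGradients(model, model.encoder.embedding)
attr = lig.attribute(x, target=0, additional_forward_args=(0.0,))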

[screenshot of the error traceback]

Is this a dimension problem specific to RNNs? Could you explain how the dimensions of the RNN input are handled in layer feature attribution?

Thank you!

vivekmig commented 6 months ago

Hi @jingyan-li , sorry for the delayed response. I think the issue in this case is that the layer is reused within the RNN, which breaks the way hooks are used to capture layer outputs. Our hooks only support layers that are invoked once per forward pass; since this layer is used multiple times, we cannot uniquely identify which occurrence to attribute.
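A standalone sketch (not from the thread) of why a reused module defeats hook-based capture: a forward hook fires once per invocation, so a module called twice produces two captured outputs and there is no single activation to attribute.

import torch
import torch.nn as nn

emb = nn.Embedding(10, 4)               # toy module reused in one forward pass

captured = []
emb.register_forward_hook(lambda mod, inp, out: captured.append(out))

tokens = torch.tensor([[1, 2, 3]])
_ = emb(tokens)                         # first use (e.g., encoder side)
_ = emb(tokens)                         # second use (e.g., decoder side)

print(len(captured))                    # 2 -- two captures, no unique output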

We are looking into fixes to provide a clearer error message or to support this use case. In the meantime, you can try performing layer attribution with a forward function in which the target layer executes only once, and see if the issue is resolved. Hope this helps!
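Continuing the hypothetical sketch from above, one way to apply this advice, assuming the reuse comes from a decoding loop that invokes the embedding once per step, is to attribute through a wrapper forward function that stops after the single-use portion of the network:

# Hypothetical wrapper: run only the part of the network in which the
# target layer (model.encoder.embedding) is invoked exactly once.
def single_pass_forward(x, teacher_forcing=0.0):
    enc_out, enc_h = model.encoder(x)   # embedding used once here
    return enc_out[:, -1, :]            # (batch, hidden) summary for target selection

lig = LayerIntegratedGradients(single_pass_forward, model.encoder.embedding)
attr = lig.attribute(x, target=0, additional_forward_args=(0.0,))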

jingyan-li commented 5 months ago

Thanks! Layer attribution works well!