pytorch / captum

Model interpretability and understanding for PyTorch
https://captum.ai
BSD 3-Clause "New" or "Revised" License

Incorrect indexing for max pooling layers with DeepLIFT #1276

Open jmschrei opened 3 months ago

jmschrei commented 3 months ago

🐛 Bug

Between captum 0.5.0 and 0.6.0, the `grad_input` and `grad_output` arguments in the DeepLIFT implementation switched from being tuples of length 1 to plain PyTorch tensors. Some of the non-linearity functions were updated to match, such as `nonlinear`:

0.5.0: https://github.com/pytorch/captum/blob/c2be437a990b49ab9a45acaad0d380f5c8f57ae8/captum/attr/_core/deep_lift.py#L956
Now: https://github.com/pytorch/captum/blob/master/captum/attr/_core/deep_lift.py#L879
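For illustration, here is a minimal sketch of the convention change (the handler names and the placeholder computation are mine, not captum's actual multiplier code):

```python
import torch

# Pre-0.6.0: the gradient arguments arrive as length-1 tuples, so handlers
# unpack them with [0].
def handler_old(grad_input: tuple, grad_output: tuple) -> torch.Tensor:
    return grad_input[0] * grad_output[0]  # placeholder computation

# 0.6.0+: the gradient arguments are plain tensors and are used directly.
def handler_new(grad_input: torch.Tensor, grad_output: torch.Tensor) -> torch.Tensor:
    return grad_input * grad_output  # placeholder computation
```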

But I don't think the max pooling function was updated accordingly.

0.5.0: https://github.com/pytorch/captum/blob/c2be437a990b49ab9a45acaad0d380f5c8f57ae8/captum/attr/_core/deep_lift.py#L1118
Now: https://github.com/pytorch/captum/blob/master/captum/attr/_core/deep_lift.py#L1023

This causes only the first index of the tensor `grad_input` to be used rather than the full tensor. I'm not sure how this happened, because `grad_output` in the same function appears to have been updated correctly:

0.5.0: https://github.com/pytorch/captum/blob/c2be437a990b49ab9a45acaad0d380f5c8f57ae8/captum/attr/_core/deep_lift.py#L1075
Now: https://github.com/pytorch/captum/blob/master/captum/attr/_core/deep_lift.py#L1023
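To make the consequence concrete, here is a small self-contained demonstration (the shapes are arbitrary, just for illustration): indexing a length-1 tuple with `[0]` unpacks the full tensor, but indexing a tensor with `[0]` slices off the first batch element.

```python
import torch

grad = torch.randn(8, 3, 32, 32)  # arbitrary example gradient

as_tuple = (grad,)         # pre-0.6.0 convention: tuple of length 1
print(as_tuple[0].shape)   # torch.Size([8, 3, 32, 32]) -- full gradient
print(grad[0].shape)       # torch.Size([3, 32, 32])    -- only sample 0
```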

jmschrei commented 2 months ago

Would it be possible to get confirmation that this is indeed a bug, and not a misunderstanding of the code on my part? Thanks!