DFKI-NLP / thermostat

Collection of NLP model explanations and accompanying analysis tools
Apache License 2.0

[InputXGradient] RuntimeError: One of the differentiated Tensors does not require grad #5

Closed by nfelnlp 3 years ago

nfelnlp commented 3 years ago

Stuck on this error while implementing InputXGradient. Tested with both DistilBERT and RoBERTa.

Traceback (most recent call last):
  File "/home/nfel/.pycharm_helpers/pydev/_pydevd_bundle/pydevd_exec2.py", line 3, in Exec
    exec(exp, global_vars, local_vars)
  File "<input>", line 1, in <module>
  File "/home/nfel/PycharmProjects/thermostat/venv/lib/python3.8/site-packages/captum/log/__init__.py", line 35, in wrapper
    return func(*args, **kwargs)
  File "/home/nfel/PycharmProjects/thermostat/venv/lib/python3.8/site-packages/captum/attr/_core/input_x_gradient.py", line 117, in attribute
    gradients = self.gradient_func(
  File "/home/nfel/PycharmProjects/thermostat/venv/lib/python3.8/site-packages/captum/_utils/gradient.py", line 125, in compute_gradients
    grads = torch.autograd.grad(torch.unbind(outputs), inputs)
  File "/home/nfel/PycharmProjects/thermostat/venv/lib/python3.8/site-packages/torch/autograd/__init__.py", line 223, in grad
    return Variable._execution_engine.run_backward(
RuntimeError: One of the differentiated Tensors does not require grad
  0%|                                                  | 0/1821 [07:25<?, ?it/s]
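For context, the error comes from Captum trying to take gradients with respect to the raw `input_ids`. A minimal sketch of the failure mode (model name, sentence, and forward function are assumptions for illustration, not the repo's actual configuration):

```python
from captum.attr import InputXGradient
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
model.eval()

inputs = tokenizer("a short example sentence", return_tensors="pt")

def forward_func(input_ids, attention_mask):
    return model(input_ids=input_ids, attention_mask=attention_mask).logits

explainer = InputXGradient(forward_func)
# input_ids is a LongTensor; integer tensors cannot require grad, so Captum's
# internal torch.autograd.grad call has nothing to differentiate and raises
# "RuntimeError: One of the differentiated Tensors does not require grad".
attributions = explainer.attribute(
    inputs["input_ids"],
    target=0,
    additional_forward_args=(inputs["attention_mask"],),
)
```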
nfelnlp commented 3 years ago

I came up with a way to at least make the attribute function work for GuidedBackprop (it would work for InputXGradient as well): https://github.com/nfelnlp/thermostat/blob/a8d74650f8ba4f4b0f6504cf8c06043e492d6ed3/src/thermostat/explainers/grad.py#L115

However, this meant attributing with respect to the embedding output rather than the token ids, so each of the 512 (padded) tokens gets one attribution value per embedding dimension. The resulting attributions shape of (1, 512, 768) probably doesn't make sense at all for token-level explanations. A sketch of the idea follows below.
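For reference, here is one way such a workaround can look (a sketch under the assumption that the linked code attributes with respect to the embedding layer's output; `model`, `inputs`, and the forward logic are reused from the sketch above, and the linked `grad.py` may differ in detail):

```python
from captum.attr import InputXGradient

def forward_from_embeds(inputs_embeds, attention_mask):
    return model(inputs_embeds=inputs_embeds, attention_mask=attention_mask).logits

# Run the embedding layer manually and turn its output into a leaf tensor
# that requires grad, so autograd has a differentiable input to work with.
embeds = model.get_input_embeddings()(inputs["input_ids"]).detach()
embeds.requires_grad_()

explainer = InputXGradient(forward_from_embeds)
attributions = explainer.attribute(
    embeds,
    target=0,
    additional_forward_args=(inputs["attention_mask"],),
)
# One attribution value per embedding dimension, hence the
# (batch, seq_len, hidden_dim) shape, e.g. (1, 512, 768) when padded to 512.
```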

nfelnlp commented 3 years ago

After consulting with @rbtsbg, InputXGradient and GuidedBackprop will be substituted by LayerGradientXActivation, which works similarly to LayerIntegratedGradients and is easier to handle: the base model's embedding layer is simply passed to the explainer at creation time.
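A sketch of what that looks like (setup reused from the first sketch above; the final aggregation step is an assumption about how per-token scores would be derived, not something stated in this thread):

```python
from captum.attr import LayerGradientXActivation

explainer = LayerGradientXActivation(forward_func, model.get_input_embeddings())
# input_ids can be passed directly: gradients are taken at the embedding
# layer's output, so the integer inputs never need requires_grad.
attributions = explainer.attribute(
    inputs["input_ids"],
    target=0,
    additional_forward_args=(inputs["attention_mask"],),
)
# Attributions are still (batch, seq_len, hidden_dim); summing over the
# hidden dimension yields one score per token.
scores = attributions.sum(dim=-1)
```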