allenai / allennlp

An open-source NLP research library, built on PyTorch.
http://www.allennlp.org
Apache License 2.0

Can't get correct output using .saliency_interpret_from_json() from Interpret #4360

Closed koren-v closed 4 years ago

koren-v commented 4 years ago

Hi, I'm trying to get the importance of each word in a sentence during sentiment classification, but I always get the same result for different inputs. The code:

from allennlp.predictors.predictor import Predictor
import allennlp_models.classification  # registers the SST classification components
from allennlp.interpret.saliency_interpreters import (
    SaliencyInterpreter,
    SimpleGradient,
    SmoothGradient,
    IntegratedGradient,
)

# Pretrained RoBERTa-large SST sentiment classifier
predictor = Predictor.from_path("https://storage.googleapis.com/allennlp-public-models/sst-roberta-large-2020.06.08.tar.gz")
simple_grad = SimpleGradient(predictor)

input_json = {'sentence': "a very well-made, funny and entertaining picture."}

simple_grad.saliency_interpret_from_json(input_json)

gives:

{'instance_1': {'grad_input_1': [1.0,
   0.0,
   0.0,
   0.0,
   0.0,
   0.0,
   0.0,
   0.0,
   0.0,
   0.0,
   0.0]}}

So what could be the reason for this problem?

matt-gardner commented 4 years ago

No idea what's causing the issue yet, but I have confirmed that this is a bug. I get the correct output with rc3, but it's broken with rc6.

koren-v commented 4 years ago

@matt-gardner Thanks for your answer.

koren-v commented 4 years ago

@matt-gardner Is there a way to load this model in rc3? Using the same code with that version gives:

ConfigurationError: sst_tokens not in acceptable choices for validation_dataset_reader.type: ['conll2003', 'interleaving', 'sequence_tagging', 'sharded', 'babi', 'text_classification_json']. You should either use the --include-package flag to make sure the correct module is loaded, or use a fully qualified class name in your config file like {"model": "my_module.models.MyModel"} to have it imported automatically.

matt-gardner commented 4 years ago

Yeah, just include this line in your script: from allennlp_models import sentiment.
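
For reference, a minimal sketch of that workaround on rc3 (the model URL and SimpleGradient usage are taken from the original report; the extra import is the line above, which registers the sst_tokens reader before the archive is loaded):

from allennlp_models import sentiment  # noqa: F401 -- registers the sst_tokens reader and SST model
from allennlp.predictors.predictor import Predictor
from allennlp.interpret.saliency_interpreters import SimpleGradient

predictor = Predictor.from_path("https://storage.googleapis.com/allennlp-public-models/sst-roberta-large-2020.06.08.tar.gz")
simple_grad = SimpleGradient(predictor)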

matt-gardner commented 4 years ago

Possibly related: https://github.com/allenai/allennlp-models/pull/85

github-actions[bot] commented 4 years ago

@matt-gardner this is just a friendly ping to make sure you haven't forgotten about this issue 😜

matt-gardner commented 4 years ago

I thought the issue might have been resolved with recent changes to the SST model / tokenization stuff. But I just ran the model on master (both this and the models repo), and had the same issue. This is still not resolved.

We're giving a tutorial at EMNLP on interpreting predictions, and this should definitely be fixed before then. I'm putting this into the 1.2 milestone, but only so we don't forget about it; it really should be in a 1.3 or a 2.1 milestone, or something, but those don't exist yet.

dirkgr commented 4 years ago

Just because it hasn't been mentioned yet: the problem is that the interpreter takes the gradients at the top of the transformer, not the bottom, when the mismatched embedder is used. Since the pooler only takes the first token ([CLS]), all of the gradient ends up there.
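
To illustrate the effect, here is a minimal PyTorch sketch (not AllenNLP internals; the module names and sizes are made up): if the classifier only reads the first position of the encoder output, the gradient at the top of the transformer is non-zero only at that position, while the gradient at the input embeddings is spread across all tokens.

import torch
import torch.nn as nn

torch.manual_seed(0)
seq_len, dim = 10, 16

embeddings = torch.randn(1, seq_len, dim, requires_grad=True)  # "bottom": input embeddings
encoder = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
encoder.eval()  # disable dropout so the example is deterministic
classifier = nn.Linear(dim, 2)

top = encoder(embeddings)        # "top": contextual representations
top.retain_grad()                # keep the gradient at the top for comparison
logits = classifier(top[:, 0])   # the pooler only reads the first ([CLS]) position
logits.sum().backward()

print(top.grad.norm(dim=-1))         # non-zero only at position 0
print(embeddings.grad.norm(dim=-1))  # spread across all positions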

dirkgr commented 4 years ago

@matt-gardner, do you have an idea of when you will get to this?

matt-gardner commented 4 years ago

I have some thoughts on how to fix this. If I'm lucky, I can maybe get to it tomorrow; if not, I'm not sure, but I might be able to make it happen next week.

dirkgr commented 4 years ago

I couldn't get the demo to work end-to-end, but I have confirmed that the example from this issue now works as expected.

cfregly commented 4 years ago

@dirkgr can you share which example works? I am still having trouble getting this working. I tried 1.0.0rc3 (as suggested above) and see this error: ConfigurationError: Cannot register textual_entailment as Predictor; name already in use for TextualEntailmentPredictor

This still seems broken.

I tried 1.1.0, 1.2.0, and master, and I see the incorrect array [1.0, 0.0, 0.0, ...].

Here is my code: https://github.com/data-science-on-aws/workshop/blob/1958164/07_train/wip/99_AllenNLP_RoBERTa_Prediction.ipynb

Any help would be appreciated! I'm hoping to get a working demo of this soon.

Should we re-open this?

dirkgr commented 4 years ago

You have to install master of both allennlp and allennlp-models, then copy and paste the code from the description above. When I do this, I see reasonable numbers. I also made a test for this here: https://github.com/allenai/allennlp-models/pull/163
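
In case it helps others landing here, a rough sanity check (a sketch of the idea, not the actual test in that PR), using the model URL, sentence, and output structure from the original report:

from allennlp.predictors.predictor import Predictor
from allennlp.interpret.saliency_interpreters import SimpleGradient
import allennlp_models.classification  # registers the SST components

predictor = Predictor.from_path("https://storage.googleapis.com/allennlp-public-models/sst-roberta-large-2020.06.08.tar.gz")
scores = SimpleGradient(predictor).saliency_interpret_from_json(
    {"sentence": "a very well-made, funny and entertaining picture."}
)["instance_1"]["grad_input_1"]

# With the bug, one token gets all the saliency mass ([1.0, 0.0, ...]);
# after the fix, it should be spread across tokens.
assert max(scores) < 1.0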