allenai / allennlp

An open-source NLP research library, built on PyTorch.
http://www.allennlp.org
Apache License 2.0

AllenNLP Interpret usage questions #3442

Closed Crista23 closed 4 years ago

Crista23 commented 5 years ago

Can you please provide an end-to-end example of how to run AllenNLP interpret on some custom text input? Thanks!

matt-gardner commented 5 years ago

The interpretation code is for interpreting particular model predictions - what model are you using that you want to interpret?

Crista23 commented 5 years ago

I am interested in using AllenNLP Interpret in combination with BERT, XLNET and RoBERTa.

matt-gardner commented 5 years ago

That still doesn't answer the question - we need more detail on what exactly you want to do before we can give specific instructions. There is some general tutorial information here: https://allennlp.org/interpret.

Crista23 commented 5 years ago

I want to visualize saliency maps for text input, thanks!

matt-gardner commented 5 years ago

For BERT, you can see this on demo.allennlp.org. For the others, you need to create a model similar to the one that's backing the demo. We'll make instructions for that soon.

Crista23 commented 5 years ago

Thanks for the info! For BERT I am interested in doing this programmatically, not through the web interface. Do you have any instructions on how to do that?

matt-gardner commented 5 years ago

@Eric-Wallace, do we have any tutorial anywhere on how to use these things programmatically?

Until we have one, you can examine the demo code. Particularly, the interpreters get instantiated here: https://github.com/allenai/allennlp-demo/blob/30d1d63e37ee4f5096a39d3e29543a89ab62f372/app.py#L133-L150

They get called here: https://github.com/allenai/allennlp-demo/blob/30d1d63e37ee4f5096a39d3e29543a89ab62f372/app.py#L443 and here: https://github.com/allenai/allennlp-demo/blob/30d1d63e37ee4f5096a39d3e29543a89ab62f372/app.py#L392-L395. You can see the documentation for the interpreters and attackers by looking at the base classes in here: https://github.com/allenai/allennlp/tree/master/allennlp/interpret.

DeNeutoy commented 4 years ago

Closing due to inactivity

pydn commented 4 years ago

Sorry to comment on this after it is closed, but I would like to know if there is any follow up on @matt-gardner 's question to @Eric-Wallace. Are there any tutorials available for how to use interpret programmatically? I'd like to use the input reduction interpreter in batch on thousands of documents at a time on a text classification model.

Alternatively, if there is an example of a config.jsonnet file that shows how to call the attacker, I think that would be a huge help.

Thanks for the help!

matt-gardner commented 4 years ago

For calling the attacker, the links I shared above to how we do it in our demo should be good enough. There is no config - you just train a model, load it in a python script, then call the methods linked above. I'm not sure how batching works there; if you run into issues with batching things, let me know and I'll look into it (preferably by opening a separate issue).

For programmatic usage of the interpret code, @sanjayss34 did this with LXMert, and has code available. I'm not sure whether it's public yet, but I know he's working on making it ready to share; Sanjay, is there anything you can share yet with @pydn? Even some small code snippets on what functions to call would be helpful.

pydn commented 4 years ago

Hi @matt-gardner, thanks so much for all of your work on this project! I wanted to quickly close the loop on this, because I was able to work out the solution using your references.

I made an example of a predictor based off of the TextClassifierPredictor in allennlp/predictors/text_classifier.py that will make a prediction on the given text and perform input reduction on every result.

I'm certain there is a more elegant way to write this, but I'm posting in case it might help someone else having the same issue.

from allennlp.common.util import JsonDict
from allennlp.predictors.predictor import Predictor
from allennlp.interpret.attackers import InputReduction
from allennlp.predictors.text_classifier import TextClassifierPredictor

class InputReductionTextClassifierPredictor(Predictor):

    def predict_json(self, json_dict: JsonDict) -> JsonDict:
        # Wrap the loaded model in the stock text classifier predictor
        # and get the ordinary prediction for the input sentence.
        predictor = TextClassifierPredictor(self._model, self._dataset_reader)
        prediction = predictor.predict(sentence=json_dict['sentence'])

        # Run input reduction on the same input, using that predictor.
        attacker = InputReduction(predictor)
        attack = attacker.attack_from_json(inputs=json_dict,
                                           input_field_to_attack='tokens',
                                           grad_input_field='grad_input_1',
                                           ignore_tokens=None)

        # Return the prediction together with the input reduction result.
        return {'prediction': prediction, 'input_reduction_output': attack}

matt-gardner commented 4 years ago

Glad you got it working!
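
For the "thousands of documents" use case raised earlier, one simple option is to loop over the texts and call the attacker once per document. The sketch below is not true batching and is not taken from the AllenNLP codebase: `reduce_documents` and its `input_key` parameter are hypothetical names, and the only library call it relies on is `attack_from_json`, with the same keyword arguments as in the predictor above. The attacker is passed in, so any object exposing that method will do.

```python
from typing import Any, Dict, Iterable, List


def reduce_documents(attacker: Any,
                     documents: Iterable[str],
                     input_key: str = "sentence",
                     **attack_kwargs: Any) -> List[Dict[str, Any]]:
    """Call attacker.attack_from_json once per document (hypothetical helper).

    `attacker` is assumed to be something like InputReduction(predictor);
    `attack_kwargs` are forwarded unchanged to attack_from_json.
    """
    results = []
    for text in documents:
        inputs = {input_key: text}
        try:
            attack = attacker.attack_from_json(inputs=inputs, **attack_kwargs)
            results.append({"text": text, "attack": attack, "error": None})
        except Exception as exc:
            # Record the failure and continue, so one bad document
            # does not abort a long run.
            results.append({"text": text, "attack": None, "error": repr(exc)})
    return results


# Intended usage, assuming `predictor` is already loaded:
#   attacker = InputReduction(predictor)
#   results = reduce_documents(attacker, texts,
#                              input_field_to_attack="tokens",
#                              grad_input_field="grad_input_1",
#                              ignore_tokens=None)
```

Each result keeps the original text next to its output, which makes it easy to write the results out as JSON lines afterwards.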

savinay commented 4 years ago

Sorry to comment after the issue is closed.

I am trying to create explanations for the model's predictions, as shown on the demo page: https://demo.allennlp.org/named-entity-recognition/MjExNTAzMg==

I have the following code:

from allennlp.interpret.saliency_interpreters import SimpleGradient
SimpleGradient(predictor).saliency_interpret_from_json({'sentence': "This shirt was bought at Grandpa Joes in downtown Deep Learning"})

and it returns me the following output: {'instance_1': {'grad_input_1': [0.7466164526464354, 0.1440905519589676, 0.018291907524359417, 0.00600954049677358, 0.041508674938798215, 0.004028081612403084, 0.024965327932526333, 0.00107885332931967, 0.004773362323580226, 0.00516803867296153, 0.0034692091461368253]}, 'instance_2': {'grad_input_1': [0.27729641993428905, 0.026113090527999944, 0.0002990537550420747, 0.008275695387992821, 0.00905805058699953, 0.06128057875171651, 0.19175197183390483, 0.2671991010403639, 0.03189344164556901, 0.10588050542080134, 0.02095201454816753]}}

Is there a way to convert this output to the visualization as shown on the demo page or do we have to build our own UI?

vinayakathavale commented 4 years ago

@savinay as far as I can understand, the output gives you saliency scores for each token with respect to each predicted entity. So the scores in instance_1 are the per-token saliency scores for the entity "Grandpa Joes", and the scores in instance_2 are for "Deep Learning"; these are the two entities in the sentence. Using this info you can easily construct your own visualization.

If you want to see the visualizations as shown on the demo page, I believe you can do that by building the code from this repo: https://github.com/allenai/allennlp-demo
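
As a starting point for a home-made visualization, here is a minimal sketch that pairs each score with its token and prints a text bar chart. The function names are hypothetical, and splitting the sentence on whitespace is an assumption: it happens to give 11 tokens, matching the 11 scores per instance above, but the model's own tokenizer is what really determines the alignment. The example scores are the instance_1 values from the output above, rounded to two decimals.

```python
from typing import Dict, List, Tuple


def pair_scores_with_tokens(tokens: List[str],
                            saliency: Dict[str, Dict[str, List[float]]],
                            grad_key: str = "grad_input_1"
                            ) -> Dict[str, List[Tuple[str, float]]]:
    """Attach each saliency score to its token, per instance."""
    paired = {}
    for instance_name, grads in saliency.items():
        scores = grads[grad_key]
        if len(scores) != len(tokens):
            raise ValueError(
                f"{instance_name}: {len(scores)} scores for {len(tokens)} tokens"
            )
        paired[instance_name] = list(zip(tokens, scores))
    return paired


def render_bars(pairs: List[Tuple[str, float]], width: int = 20) -> str:
    """Render one line per token; bar length is relative to the top score."""
    top = max(score for _, score in pairs) or 1.0
    lines = []
    for token, score in pairs:
        bar = "#" * round(width * score / top)
        lines.append(f"{token:<12} {score:.3f} {bar}")
    return "\n".join(lines)


sentence = "This shirt was bought at Grandpa Joes in downtown Deep Learning"
tokens = sentence.split()  # assumption: whitespace tokens match the model's
rounded_output = {
    "instance_1": {
        "grad_input_1": [0.75, 0.14, 0.02, 0.01, 0.04, 0.00,
                         0.02, 0.00, 0.00, 0.01, 0.00]
    }
}
for name, pairs in pair_scores_with_tokens(tokens, rounded_output).items():
    print(name)
    print(render_bars(pairs))
```

The same (token, score) pairs could be fed to an HTML template that maps each score to a background colour, if something closer to the demo's look is wanted.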

beinborn commented 3 years ago

Thank you very much for making the "interpret" package available. I am currently trying out the gradient-based interpreters for masked language modeling and have two questions:

1) If I understand it correctly, the gradient is calculated with respect to the token to which the model assigns the highest probability in the MASK position. If I want to calculate the gradient with respect to the original token, would it be enough to override this line in masked_language_model.predictions_to_labeled_instances?

mask_targets = [Token(target_top_k[0]) for target_top_k in outputs["words"]]

So, instead I would have

mask_targets = [Token("original_token")]

Or do I need to change anything else?

2) I was surprised to see that the gradient for the MASK token is quite high. How can I conceptually interpret this? Does it mean that the presence of MASK contributes significantly to the prediction of "beautiful" in the example below?

Input: "This is a MASK day." Prediction: "beautiful"

Output for Simple Gradients and your bert model:

This: 0.07 is: 0.40 a: 0.04

day: 0.16 .: 0.18

Output for Integrated Gradients:

This: 0.03 is: 0.21 a: 0.04

day: 0.04 .: 0.17

Interestingly, when I replace "day" with "game", the gradient for MASK gets much smaller (0.04) but it is still higher than the gradient for "This".
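
On question 2, it may help to look at what the reported numbers are. The sketch below assumes, without confirmation in this thread, that the simple-gradient score for a token is the dot product of its embedding with the gradient, made positive and divided by the L1 norm over all tokens; the source of `SimpleGradient` is the place to verify this. The function name is hypothetical. The NER output earlier in the thread is consistent with such a normalization, since each instance's scores sum to roughly 1.

```python
from typing import List, Sequence


def normalized_saliency(gradients: Sequence[Sequence[float]],
                        embeddings: Sequence[Sequence[float]]) -> List[float]:
    """L1-normalized |gradient . embedding| per token (assumed scheme)."""
    # One signed raw score per token: dot product of the gradient vector
    # with the embedding vector at that position.
    raw = [
        sum(g * e for g, e in zip(grad, emb))
        for grad, emb in zip(gradients, embeddings)
    ]
    l1 = sum(abs(r) for r in raw)
    if l1 == 0.0:
        return [0.0 for _ in raw]
    # The sign is discarded and the scores are rescaled to sum to 1.
    return [abs(r) / l1 for r in raw]
```

Under that assumption a score is a relative share of the total magnitude, not a signed contribution. A high value for MASK would then only say that the output is sensitive to the embedding at that position compared with the other tokens, and changing any other token (such as "day" to "game") shifts every share.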