jalammar / ecco

Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT2, BERT, RoBERTa, T5, and T0).
https://ecco.readthedocs.io
BSD 3-Clause "New" or "Revised" License

How to retrieve salience of some specific words? #105

Open CarhoJohn opened 1 year ago

CarhoJohn commented 1 year ago

Hi. To obtain the salience map over the previous tokens when generating a new token, we can use the code provided in the examples:

output = lm.generate(prompt, generate=1, do_sample=True, attribution=['ig'])
res = output.primary_attributions(attr_method='ig')

However, with this standard method I can only get the salience map for whichever word the model happens to generate. That word is sampled, so I cannot control it.

Is it possible to obtain the salience map for a specific word? For example, given the prompt "I have a dog. He is very ...", I'd like to get the salience map for the specific word "cute", rather than for whatever word the model generates.

Thanks very much!

BiEchi commented 12 months ago

From my understanding, this is not possible unless you do some algorithmic work (some math). A salience map is computed by backpropagating from the output to the input embeddings. That process is just the chain rule. If you break the chain you can target specific words, but unless the approach is mathematically grounded, it fails.
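
For illustration only, here is a minimal sketch of what "salience for a specific word" would mean in chain-rule terms. The gradient is taken of a scalar score, and that scalar can be the log-probability of a chosen target word instead of the sampled one. Everything below is an assumption made for the example: the toy vocabulary, the random weights, the `log_prob` stand-in model, and the `salience` helper are hypothetical and are not part of Ecco's API. The gradient is estimated by finite differences rather than backprop, and gradient-times-input is used in place of integrated gradients to keep the sketch short.

```python
import math
import random

random.seed(0)
VOCAB = ["I", "have", "a", "dog", ".", "He", "is", "very", "cute", "loud"]
DIM = 4

# Toy stand-in for a language model: random input embeddings and output weights.
EMB = {w: [random.uniform(-1, 1) for _ in range(DIM)] for w in VOCAB}
OUT = {w: [random.uniform(-1, 1) for _ in range(DIM)] for w in VOCAB}


def log_prob(embs, target):
    """Log-probability of `target` as the next token, given input embeddings."""
    n = len(embs)
    # Hidden state: tanh of a position-weighted average of the embeddings.
    hidden = [
        math.tanh(sum((i + 1) * e[d] for i, e in enumerate(embs)) / n)
        for d in range(DIM)
    ]
    logits = {w: sum(h * o for h, o in zip(hidden, OUT[w])) for w in VOCAB}
    log_z = math.log(sum(math.exp(v) for v in logits.values()))
    return logits[target] - log_z


def salience(tokens, target, eps=1e-5):
    """Normalized gradient-times-input score per prompt token for `target`."""
    embs = [list(EMB[t]) for t in tokens]
    scores = []
    for i in range(len(embs)):
        total = 0.0
        for d in range(DIM):
            orig = embs[i][d]
            # Central finite difference of the target's log-probability.
            embs[i][d] = orig + eps
            up = log_prob(embs, target)
            embs[i][d] = orig - eps
            down = log_prob(embs, target)
            embs[i][d] = orig
            total += (up - down) / (2 * eps) * orig
        scores.append(abs(total))
    norm = sum(scores) or 1.0
    return [s / norm for s in scores]


prompt = ["I", "have", "a", "dog", ".", "He", "is", "very"]
for word in ("cute", "loud"):
    # The target is chosen by the caller; nothing is sampled.
    print(word, [round(s, 3) for s in salience(prompt, word)])
```

The only thing that changes between the two calls is which word's log-probability is differentiated. Whether Ecco exposes a way to pick that target is a separate question that this sketch does not answer.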