jalammar / ecco

Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT2, BERT, RoBERTa, T5, and T0).
https://ecco.readthedocs.io
BSD 3-Clause "New" or "Revised" License

[bug fix] seems the hotfix for T5 generation is not needed for later versions of transformers #61

Closed · litanlitudan closed this 2 years ago

litanlitudan commented 2 years ago

Hi Jay,

Thanks for the great tool that you built. I was playing around with it and ran into the following error on the T5-small example from the readme.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_1710801/2602227185.py in <module>
      4 to feed my soul. I didn't expect it to entirely blow my mind."""
      5 
----> 6 output = lm.generate(f"sst2 sentence: {review}", generate=1, do_sample=False, attribution=['ig'])
      7 output.primary_attributions(attr_method='ig', ignore_tokens=[0,1,2,3,4,5,6,43,44])

~/ecco/src/ecco/lm.py in generate(self, input_str, max_length, temperature, top_k, top_p, do_sample, attribution, generate, beam_size, **generate_kwargs)
    201             assert len(input_ids.size()) == 2 # will break otherwise
    202             if transformers.__version__ >= '4.13': # ALSO FIXME: awful hack. But seems to work?
--> 203                 decoder_input_ids = self.model._prepare_decoder_input_ids_for_generation(input_ids.shape[0], None, None)
    204             else:
    205                 decoder_input_ids = self.model._prepare_decoder_input_ids_for_generation(input_ids, None, None)

~/miniconda3/envs/ecco/lib/python3.9/site-packages/transformers/generation_utils.py in _prepare_decoder_input_ids_for_generation(self, input_ids, decoder_start_token_id, bos_token_id)
    420         decoder_start_token_id = self._get_decoder_start_token_id(decoder_start_token_id, bos_token_id)
    421         decoder_input_ids = (
--> 422             torch.ones((input_ids.shape[0], 1), dtype=torch.long, device=input_ids.device) * decoder_start_token_id
    423         )
    424         return decoder_input_ids

AttributeError: 'int' object has no attribute 'shape'

Here is the example code I was playing with:

import ecco
lm = ecco.from_pretrained('t5-small', verbose=False)
review="""I have a well-documented weakness for sci-fi and expected Dune 
to feed my soul. I didn't expect it to entirely blow my mind."""

output = lm.generate(f"sst2 sentence: {review}", generate=1, do_sample=False, attribution=['ig'])
output.primary_attributions(attr_method='ig', ignore_tokens=[0,1,2,3,4,5,6,43,44])

After some debugging, I realized that the hotfix for T5 generation from previous commits might not be needed anymore. But I am not sure from which exact version of transformers onward the hack becomes unnecessary, so I thought it would be a good idea to file a quick PR to bring attention to this.

joaonadkarni commented 2 years ago

I think that the intent of the code is correct. As you can see in the changes introduced by version 4.13 of the transformers lib (you can check those in this comparison with the previous version), the expected input of the function (_prepare_decoder_input_ids_for_generation in the src/transformers/generation_utils.py file) indeed changed from a tensor to an int.
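Concretely, this is the guard in ecco's lm.py (quoted from the traceback above), with a comment added per branch:

if transformers.__version__ >= '4.13':  # the problematic string comparison
    # transformers >= 4.13: the first argument is the batch size (an int)
    decoder_input_ids = self.model._prepare_decoder_input_ids_for_generation(input_ids.shape[0], None, None)
else:
    # transformers < 4.13: the first argument was the full input_ids tensor
    decoder_input_ids = self.model._prepare_decoder_input_ids_for_generation(input_ids, None, None)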

But what I think causes this error (I ran into it as well) is that the version check itself is wrong. Since the code just compares strings lexicographically, we get, e.g., "4.6.1" > "4.13".
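A minimal sketch of the difference, using the packaging library (already a transformers dependency) for a proper comparison:

from packaging import version

# Comparing raw strings is lexicographic: "6" > "1", so this is True
print("4.6.1" > "4.13")                                # True (wrong)

# Parsing first compares the release numbers numerically: 6 < 13
print(version.parse("4.6.1") > version.parse("4.13"))  # False (right)

# So the guard in lm.py would need something like:
# if version.parse(transformers.__version__) >= version.parse('4.13'): ...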

I opened a PR that I believe solves this: https://github.com/jalammar/ecco/pull/62

jalammar commented 2 years ago

Thanks! Indeed fixed with #62.