jalammar / ecco

Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT2, BERT, RoBERTa, T5, and T0).
https://ecco.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Add support for PEGASUS model #63

Closed thomas-chong closed 2 years ago

thomas-chong commented 2 years ago

I would like to add support for PEGASUS in model-config.yaml.

PEGASUS is an encoder-decoder model, and its implementation is completely inherited from BartForConditionalGeneration, so the config is similar to the one for the BART model.

Note: This is my first pull request on an open-source project, but I hope this helps!

jalammar commented 2 years ago

Thank you for the contribution @thomas-chong. Were you able to run the model and use it to generate text?

thomas-chong commented 2 years ago

Yes @jalammar. I implemented it, and it worked perfectly for generating an abstractive summary with PEGASUS.


jalammar commented 2 years ago

@thomas-chong Brilliant! Can you please change the token prefix to the character: '▁' (instead of the normal underscore '_').

What would you think of also adding the other Pegasus models, since they'll likely use the same config? For example:

https://huggingface.co/google/pegasus-xsum
https://huggingface.co/google/pegasus-large
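
To make the token prefix request concrete: '▁' (U+2581) is the word-boundary marker that SentencePiece tokenizers put at the start of a token, and it is a different character from the ASCII underscore '_' (U+005F). Here is a minimal sketch of why the distinction matters when a prefix is stripped for display. The helper `strip_token_prefix` and the token list are hypothetical illustrations, not Ecco's code or real tokenizer output.

```python
import unicodedata

SENTENCEPIECE_PREFIX = "\u2581"  # '▁', the SentencePiece word-boundary marker
ASCII_UNDERSCORE = "_"           # U+005F, visually similar but a different character


def strip_token_prefix(token: str, prefix: str) -> str:
    """Hypothetical helper: drop a leading word-boundary marker, if present."""
    return token[len(prefix):] if token.startswith(prefix) else token


# Illustrative tokens only (not real PEGASUS tokenizer output).
tokens = ["\u2581The", "\u2581tower", "\u2581is", "\u2581tall", "er"]

# With the correct prefix, the markers are removed.
cleaned = [strip_token_prefix(t, SENTENCEPIECE_PREFIX) for t in tokens]

# With the ASCII underscore, nothing matches and the markers stay in place.
unchanged = [strip_token_prefix(t, ASCII_UNDERSCORE) for t in tokens]

print(unicodedata.name(SENTENCEPIECE_PREFIX), "vs", unicodedata.name(ASCII_UNDERSCORE))
print(cleaned)
print(unchanged)
```

If the config used the plain underscore, a prefix check like this one would never match, so tokens would keep their '▁' marker.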

jalammar commented 2 years ago

Just for my records, to update the notebook with an example later:

# The PEGASUS tokenizer needs SentencePiece
!pip install sentencepiece

import ecco

# Load the PEGASUS checkpoint through Ecco
lm = ecco.from_pretrained('google/pegasus-cnn_dailymail', verbose=False)
prompt=""" The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure to reach a height of 300 metres. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct."""

# Generate 50 tokens with sampling, then display the result in the notebook
output = lm.generate(prompt, generate=50, do_sample=True)
output

Attribution works but runs out of memory for me, so some optimization is likely needed in the future.

thomas-chong commented 2 years ago

@jalammar I have added all the PEGASUS downstream models to model-config.yaml as well.
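
As a rough sketch of what "the same config for every PEGASUS checkpoint" could look like, the snippet below builds one entry per model name from a shared template. The field names and the `embedding` and `activations` values are assumptions modeled on a BART-style entry, not copied from model-config.yaml. The only details taken from this thread are the three checkpoint names, the encoder-decoder type, and the '▁' token prefix.

```python
import copy

# Assumed template. Field names and the embedding/activations values are
# guesses based on a BART-style entry, not the actual file contents.
PEGASUS_TEMPLATE = {
    "embedding": "model.shared.weight",  # assumption
    "type": "enc-dec",                   # PEGASUS is an encoder-decoder model
    "activations": ["fc1"],              # assumption
    "token_prefix": "\u2581",            # '▁', as requested in the review
    "partial_token_prefix": "",          # assumption
}

# Checkpoints mentioned in this thread.
PEGASUS_CHECKPOINTS = [
    "google/pegasus-cnn_dailymail",
    "google/pegasus-xsum",
    "google/pegasus-large",
]


def build_entries(checkpoints, template):
    """Return one independent copy of the template per checkpoint name."""
    return {name: copy.deepcopy(template) for name in checkpoints}


entries = build_entries(PEGASUS_CHECKPOINTS, PEGASUS_TEMPLATE)
for name, entry in entries.items():
    print(name, entry)
```

Each entry is a deep copy, so a later change to one checkpoint's settings does not affect the others.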

jalammar commented 2 years ago

Brilliant