MichaelJanz closed this issue 3 years ago.
This seems to be a problem with the 'google/pegasus-arxiv' model; when you use 'google/pegasus-xsum' instead, you get the following output:
Harry Potter and the Philosopher’s Stone is the seventh and final book in JK Rowling’s Harry Potter series.
Yes, I tried different Pegasus models (including a lot of other models). pegasus-large, e.g., outputs this (which I think is really good): In this sequel to the phenomenally popular Harry Potter and the Sorcerer’s Stone, Harry returns to Hogwarts School of Witchcraft and Wizardry for his second year after a miserable summer with his Muggle (nonmagical) relatives. Rowling clearly hit on a winning formula with the first Harry Potter book; the second book — though still great fun — feels a tad, well, formulaic.
while pegasus-multinews outputs well-written text that is unfortunately wrong in content: – The seventh and final book in the Harry Potter series, Harry Potter and the Sorcerer's Stone, is out today. The sixth book in the series, Harry Potter and the Deathly Hallows, was released in the US in advance of tomorrow's release in the UK. Here's what critics are saying about the seventh and final book in the series: The plot is still compelling, but the book "feels a tad, well, formulaic," writes James Poniewozik in Time. "The atmosphere Rowling creates is unique; the story whizzes along; Harry is an unassuming and completely sympathetic hero. But, truth to tell, you may feel as if you've read it all before. Rowling clearly hit on a winning formula with the first Harry Potter book; the second book—though still great fun—feels a tad, well, formulaic."
Gigaword and billsum also both output text that is not useful.
Another question: while pegasus-large and pegasus-cnn_dailymail both only return the most important sentences, pegasus-multinews even generates new text. I was hoping for the same from the arxiv model; is there a reason it differs in that way?
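For anyone who wants to reproduce these comparisons, a minimal sketch (the checkpoint ids are real Hugging Face Hub names corresponding to the models discussed above; the summarize helper is my own wrapper, and calling it downloads each checkpoint, which is why the heavy imports are deferred inside the function):

```python
# Sketch: run the same source text through several Pegasus checkpoints
# to compare their summarization styles.
CHECKPOINTS = [
    "google/pegasus-xsum",
    "google/pegasus-large",
    "google/pegasus-multi_news",
    "google/pegasus-cnn_dailymail",
]

def summarize(text: str, model_name: str, device: str = "cpu") -> str:
    # Imports kept local: calling this downloads the model weights.
    from transformers import PegasusForConditionalGeneration, PegasusTokenizer
    tokenizer = PegasusTokenizer.from_pretrained(model_name)
    model = PegasusForConditionalGeneration.from_pretrained(model_name).to(device)
    batch = tokenizer([text], truncation=True, padding="longest",
                      return_tensors="pt").to(device)
    ids = model.generate(**batch)
    return tokenizer.batch_decode(ids, skip_special_tokens=True)[0]

# Usage (book_blurb stands in for your own input text):
# for name in CHECKPOINTS:
#     print(name, "->", summarize(book_blurb, name))
```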
pegasus-arxiv is trained on and expects scientific text. pegasus-multinews expects news, I presume.
If you want to prove a bug, try running an evaluation on a public dataset from the datasets package and posting the result (see #6844).
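To make such an evaluation concrete, here is a deliberately simplified ROUGE-1 F1 in plain Python. A real evaluation would use the datasets package plus a proper ROUGE implementation; this toy version only illustrates what the score measures (clipped unigram overlap between a reference summary and a generated one):

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1: a toy stand-in for a real ROUGE implementation."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("the cat sat on the mat", "the cat sat"))  # ≈ 0.667
```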
transformers version: 3.1.0
Platform: Windows 10
Python version: 3.7.6
PyTorch version (GPU?): 1.5.0 (False)
Using GPU in script?: no
Using distributed or parallel set-up in script?: no
I found unexpected behaviour when using Pegasus-Pubmed on a PubMed document.
import torch
from transformers import PegasusForConditionalGeneration, PegasusTokenizer
src_text ="""although the association is modest , it is important because of the increasing prevalence of metabolic syndrome and the effect that depression can have on the ability of patients to successfully make lifestyle changes and comply with medication required for hypertension and dyslipidemia . the association is demonstrated here in a general population to our knowledge for the first time , whereas earlier studies ( table 1 ) used subgroups of populations ( 813,17 ) . this distinction is important because many individuals with metabolic syndrome have diabetes , which itself is known to be associated with depression ( 5 ) . metabolic syndrome has been defined in several ways that involve quantitative anthropometric , clinical , and laboratory measurements ( 1,2 ) . for the primary assessment , we chose ncep atp iii ( 1 ) criteria , since these criteria were used in most of the previously reported studies ( 8,9,1113,17 ) ."""
model_name = 'google/pegasus-pubmed'
torch_device = 'cuda' if torch.cuda.is_available() else 'cpu'
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name).to(torch_device)
# prepare_seq2seq_batch truncates the source to the model's maximum input length
batch = tokenizer.prepare_seq2seq_batch([src_text], truncation=True, padding='longest').to(torch_device)
translated = model.generate(**batch)
tgt_text = tokenizer.batch_decode(translated, skip_special_tokens=True)
print(tgt_text)
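Separately from whether pegasus-pubmed should behave this way, the summary length can be capped at generation time. A sketch continuing the snippet above (the generate_capped helper and the specific numbers are mine, not the checkpoint's tuned defaults; max_length, num_beams, and early_stopping are standard generate() arguments):

```python
def generate_capped(model, tokenizer, batch, max_tokens=128):
    # max_length counts tokens, not characters; the values for num_beams
    # and early_stopping are illustrative.
    ids = model.generate(
        **batch,
        max_length=max_tokens,
        num_beams=8,
        early_stopping=True,
    )
    return tokenizer.batch_decode(ids, skip_special_tokens=True)

# Usage with the objects from the snippet above:
# print(generate_capped(model, tokenizer, batch))
```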
I expected a summary of the input, but I received a text longer than the input: the input is 929 characters vs. 1129 for the predicted summary. In particular, Pegasus generates new claims (the bold text) which are not in the input text.
Input: the src_text shown in the snippet above.
Output:
['depression is known to be associated with metabolic syndrome, but its association with metabolic syndrome has not been studied in a general population.
Is that behaviour correct?
Output should be < 256 tokens (not characters). Input should probably be longer (closer to 1024 tokens). Try copying something from the leftmost column of the dataset.
We've now replicated that our Pegasus port performs similarly well to the authors' implementation on 11 datasets, including arxiv.
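A quick way to check the point above is to compare lengths in tokens rather than characters. A sketch (token_count is my own helper; the commented lines show how it would be used with the real PegasusTokenizer, which downloads the vocabulary on first use):

```python
def token_count(tokenizer, text: str) -> int:
    # Works with any Hugging Face tokenizer: count the ids the model
    # actually sees, including special tokens such as the trailing </s>.
    return len(tokenizer(text, add_special_tokens=True)["input_ids"])

# from transformers import PegasusTokenizer
# tok = PegasusTokenizer.from_pretrained("google/pegasus-pubmed")
# print(token_count(tok, src_text))     # input length in tokens (target ~1024)
# print(token_count(tok, tgt_text[0]))  # summary length in tokens (< 256 expected)
```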
Environment info
transformers version: 3.1.0

Who can help
@sshleifer

Information
Model I am using: Pegasus-Arxiv
The problem arises when using:

To reproduce
Steps to reproduce the behavior:

Expected behavior
I expect a clear summary of the text, but I receive output with no connection to the input, written like a scientific paper:
['this is the first of a series of papers in which we address the question of whether or not the laws of thermodynamics are valid in the limit of infinitely many degrees of freedom. we show that the laws of thermodynamics are valid in the limit of infinitely many degrees of freedom. this is the first of a series of papers in which we address the question of whether or not the laws of thermodynamics are valid in the limit of infinitely many degrees of freedom. we show that the laws of thermodynamics are valid in the limit of infinitely many degrees of freedom. [ theorem]acknowledgement [ theorem]algorithm [ theorem]axiom [ theorem]claim [ theorem]conclusion [ theorem]condition [ theorem]conjecture [ theorem]corollary [ theorem]criterion [ theorem]definition [ theorem]example [ theorem]exercise [ theorem]lemma [ theorem]notation [ theorem]problem [ theorem]proposition [ theorem]remark [ theorem]solution [ theorem]summary this is the first of a series of papers in which we address the question of whether or not the laws of thermodynamics are valid in the limit of infinitely many degrees of freedom.']
Am I doing something wrong, or is it the model? Thanks