allenai / PRIMER

The official code for PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization
Apache License 2.0

Questions about inferencing #21

Open FightingEveryDay0 opened 2 years ago

FightingEveryDay0 commented 2 years ago

Hi, thank you for sharing this work. I am having trouble using PRIMERA to generate a summary. Could you please help me use the pretrained PRIMERA model to generate a summary correctly? My code is as follows:

import torch
from transformers import AutoTokenizer
from longformer import LongformerEncoderDecoderForConditionalGeneration
from longformer import LongformerEncoderDecoderConfig
import time
tokenizer = AutoTokenizer.from_pretrained('/data/users/wangyiting/primer/PRIMER-main/models/PRIMER_multinews')
config = LongformerEncoderDecoderConfig.from_pretrained('/data/users/wangyiting/primer/PRIMER-main/models/PRIMER_multinews')
model = LongformerEncoderDecoderForConditionalGeneration.from_pretrained(
    './PRIMERA_model/', config=config)

# import torch
# from longformer.longformer import Longformer, LongformerConfig
from longformer.sliding_chunks import pad_to_window_size
# from transformers import RobertaTokenizer

# SAMPLE_TEXT
start_time = time.time()
SAMPLE_TEXT = """An 11-year-old boy who survived being sucked into a flooded stormwater drain has been reunited with his rescuers in Melbourne and gifted a new bike a week after the tumultuous ordeal. Jake Gilbert was cycling with a friend in Altona Meadows last week when he rode across a submerged drain and was sucked 10 metres underneath a road. Stormwater drain ‘I love you all!’: boy sucked into stormwater drain in Melbourne praises rescuers after amazing escape. Gilbert managed to grab on to the underside of a metal grate on the other side and keep his head above water before passerby Damon Trewhella and off-duty SES member Justin Costello came to his aid. Kyle, who was also washed off his bike at the same time, had managed to avoid being sucked into the flooded stormwater drain. The SES member removed the bolts from the drain’s grate before the police officer prised the grate open – with Gilbert still desperately clinging to the underside by his fingernails. His head was just above the water before he was pulled to safety. he's getting her energy back and she's back to being a 'two-step launcher' when she goes to walk – takes two steps and launches off and takes your shoulders off – but prior to that, she'd lost all energy and she couldn't hold her own back legs up."""
input_ids = torch.tensor(tokenizer.encode(SAMPLE_TEXT)).unsqueeze(0)  # batch of size 1

# # TVM code doesn't work on CPU. Uncomment this if `config.attention_mode = 'tvm'`
# model = model.cuda(); input_ids = input_ids.cuda()

# Attention mask values -- 0: no attention, 1: local attention, 2: global attention
attention_mask = torch.ones(input_ids.shape, dtype=torch.long, device=input_ids.device) # initialize to local attention
attention_mask[:, [1, 4, 21,]] =  2  # Set global attention based on the task. For example,
                                     # classification: the <s> token
                                     # QA: question tokens

# # padding seqlen to the nearest multiple of 512. Needed for the 'sliding_chunks' attention
input_ids, attention_mask = pad_to_window_size(
        input_ids, attention_mask, config.attention_window[0], tokenizer.pad_token_id)

max_output_len = 100
generated_ids = model.generate(input_ids=input_ids, attention_mask=attention_mask,
                               use_cache=True, max_length=max_output_len,
                               num_beams=1)
generated_str = tokenizer.batch_decode(generated_ids.tolist(), skip_special_tokens=True)
end_time = time.time()
print("spending: ", end_time-start_time)
print(generated_str[0])
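
The mask and padding steps in the snippet above can be tried out without loading the model. The following is a minimal pure-Python sketch, assuming the 0/1/2 mask convention stated in the snippet's comment. The helper names `build_attention_mask` and `pad_to_multiple` and the token ids are hypothetical, and `pad_to_multiple` only illustrates the idea of padding to a multiple of the window size; it is not the library's `pad_to_window_size`.

```python
from typing import List, Sequence, Tuple

# Mask values as described in the snippet's comment.
NO_ATTENTION, LOCAL_ATTENTION, GLOBAL_ATTENTION = 0, 1, 2


def build_attention_mask(input_ids: Sequence[int],
                         global_positions: Sequence[int]) -> List[int]:
    """Local attention everywhere, global attention at the chosen positions."""
    mask = [LOCAL_ATTENTION] * len(input_ids)
    for pos in global_positions:
        mask[pos] = GLOBAL_ATTENTION
    return mask


def pad_to_multiple(input_ids: Sequence[int],
                    attention_mask: Sequence[int],
                    multiple: int,
                    pad_token_id: int) -> Tuple[List[int], List[int]]:
    """Right-pad both sequences so their length is a multiple of `multiple`."""
    pad_len = (multiple - len(input_ids) % multiple) % multiple
    padded_ids = list(input_ids) + [pad_token_id] * pad_len
    # Padding positions receive no attention.
    padded_mask = list(attention_mask) + [NO_ATTENTION] * pad_len
    return padded_ids, padded_mask


# Made-up token ids, for illustration only.
ids = [10, 11, 12, 13, 14, 15, 16]
mask = build_attention_mask(ids, global_positions=[0])
padded_ids, padded_mask = pad_to_multiple(ids, mask, multiple=4, pad_token_id=1)
print(padded_ids)
print(padded_mask)
```

Which positions should receive global attention depends on the task, as the snippet's own comment notes; the indices 1, 4 and 21 used above are only an example.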
FightingEveryDay0 commented 2 years ago

And the generated_str is: An 11-year-old boy who survived being sucked into a flooded stormwater drain has been reunited with his rescuers in Melbourne and gifted a new bike a week after the tumultuous ordeal. Jake Gilbert was cycling with a friend in Altona Meadows last week when he rode across a submerged drain and was sucked 10 metres underneath a road. Stormwater drain ‘I love you all!’: boy sucked into stormwater drain in Melbourne praises rescuers after amazing escape. Gilbert managed to

However, this appears to be just a copy of the beginning of the input. I don't know how to run inference with PRIMERA correctly.
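
One way to check the "just a copy of the input" observation is to measure how much of the output matches the input token-for-token from the start. This is a rough diagnostic sketch, not part of PRIMERA: `copied_prefix_ratio` is a hypothetical helper, and it splits on whitespace rather than using the model's tokenizer.

```python
def copied_prefix_ratio(source: str, generated: str) -> float:
    """Fraction of generated tokens that repeat the source verbatim from its first token."""
    src_tokens = source.split()
    gen_tokens = generated.split()
    if not gen_tokens:
        return 0.0
    matched = 0
    for src_tok, gen_tok in zip(src_tokens, gen_tokens):
        if src_tok != gen_tok:
            break
        matched += 1
    return matched / len(gen_tokens)


# Made-up strings, for illustration only.
source = "The boy was rescued from the drain by a passerby."
print(copied_prefix_ratio(source, "The boy was rescued from"))    # 1.0: pure copy of the prefix
print(copied_prefix_ratio(source, "A passerby rescued the boy."))  # 0.0: diverges at the first token
```

A ratio close to 1.0 means the model is echoing the start of the input rather than summarizing it.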

FightingEveryDay0 commented 2 years ago

When using primer_main.py, "output_ids" is not fed into the forward() function. Sorry, I am having trouble using the pretrained model. Could you help me?

desis123 commented 1 year ago

@FightingEveryDay0 This is what I got when I loaded the model with
model = LongformerEncoderDecoderForConditionalGeneration.from_pretrained( './PRIMER_multinews/', config=config)

and set max_output_len = 1000:

An 11-year-old boy in Melbourne, Australia, is being praised for his survival after he was sucked into a flooded stormwater drain and saved by a passerby and an off-duty emergency services member. Jake Gilbert was cycling with friend Kyle when he crossed a submerged drain and was sucked 10 feet beneath a road, the Daily Telegraph reports. Gilbert managed to grab on to the underside of a metal grate on the other side and keep his head above water before the passerby and the off-duty SES member removed the bolts and pried the grate open. Kyle was also washed off his bike, but he managed to avoid being sucked into the drain. A week after the dramatic escape, Jake was reunited with his rescuers and given a new bike, the Telegraph reports.
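
Comparing the two snippets, the original post loads the tokenizer and config from a PRIMER_multinews directory but the weights from './PRIMERA_model/', whereas this reply loads the weights from './PRIMER_multinews/'. The thread does not confirm that this is the cause of the copied output, but a small guard can rule out a mismatch. The sketch below uses a hypothetical helper, `same_checkpoint`, that only compares paths and does not load anything.

```python
import os


def same_checkpoint(*paths: str) -> bool:
    """Return True if all given paths resolve to the same directory."""
    resolved = {os.path.realpath(os.path.expanduser(p)) for p in paths}
    return len(resolved) == 1


# Paths in the style of the thread; the directories need not exist for this check.
tokenizer_dir = "./PRIMER_multinews/"
config_dir = "./PRIMER_multinews"
model_dir = "./PRIMERA_model/"

print(same_checkpoint(tokenizer_dir, config_dir))             # True
print(same_checkpoint(tokenizer_dir, config_dir, model_dir))  # False
```

Keeping the checkpoint path in a single variable and passing it to all three from_pretrained calls avoids this kind of mismatch.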

SabrinaZhuangxx commented 1 year ago

I have encountered the same problem. Have you solved it?