nlpyang / PreSumm

Code for the EMNLP 2019 paper "Text Summarization with Pretrained Encoders"
MIT License

Extractive summarization with raw text input not working #122

Open JohannesTK opened 4 years ago

JohannesTK commented 4 years ago

Extractive summarization with raw text input returns only the input text up to the first [CLS] separator, i.e. just the first sentence.

Model: CNN/DM Extractive

Command to reproduce:

python3 train.py -task ext -mode test_text -test_from /home/ubuntu/PreSumm/models/bertext_cnndm_transformer.pt -text_src /home/ubuntu/PreSumm/raw_data/temp_ext.raw_src -result_path /home/ubuntu/PreSumm/output.txt -visible_gpus 0 -max_pos 512 -max_length 200 -alpha 0.95 -min_length 50
[2020-01-29 22:15:53,900 INFO] Loading checkpoint from /home/ubuntu/PreSumm/models/bertext_cnndm_transformer.pt
Namespace(accum_count=1, alpha=0.95, batch_size=140, beam_size=5, bert_data_path='../bert_data_new/cnndm', beta1=0.9, beta2=0.999, block_trigram=True, dec_dropout=0.2, dec_ff_size=2048, dec_heads=8, dec_hidden_size=768, dec_layers=6, enc_dropout=0.2, enc_ff_size=512, enc_hidden_size=512, enc_layers=6, encoder='bert', ext_dropout=0.2, ext_ff_size=2048, ext_heads=8, ext_hidden_size=768, ext_layers=2, finetune_bert=True, generator_shard_size=32, gpu_ranks=[0], label_smoothing=0.1, large=False, load_from_extractive='', log_file='../logs/cnndm.log', lr=1, lr_bert=0.002, lr_dec=0.002, max_grad_norm=0, max_length=200, max_ndocs_in_batch=6, max_pos=512, max_tgt_len=140, min_length=50, mode='test_text', model_path='../models/', optim='adam', param_init=0, param_init_glorot=True, recall_eval=False, report_every=1, report_rouge=True, result_path='/home/ubuntu/PreSumm/output.txt', save_checkpoint_steps=5, seed=666, sep_optim=False, share_emb=False, task='ext', temp_dir='../temp', test_all=False, test_batch_size=200, test_from='/home/ubuntu/PreSumm/models/bertext_cnndm_transformer.pt', test_start_from=-1, text_src='/home/ubuntu/PreSumm/test.txt', text_tgt='', train_from='', train_steps=1000, use_bert_emb=False, use_interval=True, visible_gpus='0', warmup_steps=8000, warmup_steps_bert=8000, warmup_steps_dec=8000, world_size=1)
[2020-01-29 22:15:54,798 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at ../temp/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.bf3b9ea126d8c0001ee8a1e8b92229871d06d36d8808208cc2449280da87785c
[2020-01-29 22:15:54,798 INFO] Model config {
  "attention_probs_dropout_prob": 0.1,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "num_labels": 2,
  "output_attentions": false,
  "output_hidden_states": false,
  "pruned_heads": {},
  "torchscript": false,
  "type_vocab_size": 2,
  "vocab_size": 30522
}

[2020-01-29 22:15:54,822 INFO] loading weights file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin from cache at ../temp/aa1ef1aede4482d0dbcd4d52baad8ae300e60902e88fcb0bebdec09afd232066.36ca03ab34a1a5d5fa7bc3d03d55c4fa650fed07220e2eeebc06ce58d0e9a157
gpu_rank 0
[2020-01-29 22:16:02,454 INFO] * number of parameters: 120512513
[2020-01-29 22:16:02,540 INFO] loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/ubuntu/.cache/torch/pytorch_transformers/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
 50%|███████████████████████████████████████████████████████████████▌                                                               | 1/2 [00:00<00:00, 40.63it/s]
[2020-01-29 22:16:02,644 INFO] Validation xent: 0 at step -1

Output:

cat output.txt_step-1.candidate 
this Terry Jones had a love of the absurd that contributed much to the anarchic humour of Monty Python's Flying Circus.
(CNN) An Iranian chess referee says she is frightened to return home after she was criticized online for not wearing the appropriate headscarf during an international tournament.
GabrielBianconi commented 4 years ago

I am having the same issue with:

python train.py -mode test_text -task ext -test_from ../weights/bertext_cnndm.pt -text_src ../raw_data/temp_ext.raw_src

I tried other articles myself (with sentences segmented with [CLS] [SEP] like in the examples) and the model only returns the first sentence for each.
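
For reference, the extractive raw-text input (-text_src) is expected to contain one document per line, with sentences separated by [CLS] [SEP], following the convention of the bundled raw_data/temp_ext.raw_src mentioned above. A made-up example line:

the city council met on monday to discuss the new budget . [CLS] [SEP] the proposal increases funding for public transport . [CLS] [SEP] a final vote is expected next month .

With the bug described in this issue, only the text before the first [CLS] [SEP] (the first sentence) ends up in the .candidate file.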

dardodel commented 4 years ago

I got the same issue too. I also tried cleaning up the input text file, but it did not help either. Any update? BTW, how does the BERTSUM extractive algorithm work? Does it "select" the sentences from the original article that best represent it? (A rough sketch of the selection step follows the command below.)

Here is the command line I use: python3 ./src/train.py -task ext -mode test_text -test_from ./models/bertext_cnndm_transformer.pt -text_src ./raw_data/temp_ext.raw_src -result_path ./results/output.txt -log_file ./logs/test.log -visible_gpus -1 -max_pos 512 -max_length 200 -alpha 0.95 -min_length 50
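
Regarding the question about how the extractive model works: per the paper, BERTSUMEXT inserts a [CLS] token before every sentence, scores each sentence's [CLS] representation with a small stack of extra Transformer layers (ext_layers=2 in the log above), and at test time selects the top-scoring sentences (top 3 for CNN/DM), skipping sentences whose trigrams overlap the summary built so far when block_trigram is enabled. So yes, it copies whole sentences from the article rather than generating new text. A rough, self-contained sketch of just that selection step, with hypothetical sentences and scores standing in for the model's output:

def get_ngrams(n, words):
    # All n-grams of a token list, used for the trigram-overlap check.
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def select_sentences(sentences, scores, top_k=3, block_trigram=True):
    # Greedily pick the highest-scoring sentences; optionally skip any sentence
    # that repeats a trigram already present in the summary so far.
    selected, summary_words = [], []
    for idx in sorted(range(len(scores)), key=lambda i: -scores[i]):
        words = sentences[idx].split()
        if block_trigram and get_ngrams(3, words) & get_ngrams(3, summary_words):
            continue
        selected.append(idx)
        summary_words.extend(words)
        if len(selected) == top_k:
            break
    return [sentences[i] for i in selected]

# Hypothetical scores; in the real model they come from the classifier head.
sents = ["the city council met on monday to discuss the new budget .",
         "councillors argued for several hours .",
         "the budget vote is expected next month .",
         "in other news , the weather was mild ."]
print(select_sentences(sents, scores=[0.91, 0.42, 0.77, 0.12]))

The buggy behaviour reported here, by contrast, always returns just the first sentence regardless of the scores.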

Vinothsuku commented 4 years ago

Does anyone have a solution, or could someone point us in the right direction? I have the same issue: the output of the extractive model is just the first sentence of each segment.

dardodel commented 4 years ago

Does anyone have a solution, or could someone point us in the right direction? I have the same issue: the output of the extractive model is just the first sentence of each segment.

I am able to run the following command; I needed to update the code as well (just follow the error messages and fix them manually): python3 ./src/train.py -task abs -mode test_text -text_src ./raw_data/test_text.txt -batch_size 140 -test_batch_size 200 -log_file ./logs/pred.log -test_from ./models/model_step_148000.pt -sep_optim true -use_interval true -max_pos 512 -max_length 20 -alpha 0.95 -min_length 5 -result_path ./results -test_all True

Note that you can change the max and min length on the command line based on your data. Since the model I used ("model_step_148000.pt") was trained on CNN news, it can summarize news abstractively very well. When I wrote my own story, it just picked the best sentences from my story (so more extractive than abstractive, but the sentences were not necessarily from the beginning of the text; note that each story has to be on one line for abstractive mode, as mentioned in the original README). I found that if your story is too short and all your sentences convey a similar meaning, the model will likely just select the primary sentences.

BTW, I am still not able to run the extractive mode.
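
For reference (and in contrast with the extractive format shown earlier), the abstractive raw-text input is simply one complete document per line, as described above; a made-up test_text.txt could look like this:

the city council met on monday to discuss the new budget . the proposal increases funding for public transport . a final vote is expected next month .
researchers released a new dataset of annotated news articles . the dataset covers ten languages and is freely available .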

Vinothsuku commented 4 years ago

@dardodel Thanks for the details you shared about abstractive summarization. The question about the extractive model still remains :(

ellitz commented 4 years ago

I am having the same issue when I run the extractive method. I always get the first sentence, even with their example as input. Can someone please help?

I agree with @dardodel; I experienced the same. The abstractive model can actually be used as a kind of extractive method, but the result is still not the best. It really depends on what you include in the txt file.

xnancy commented 4 years ago

See https://github.com/nlpyang/PreSumm/issues/130#issuecomment-600965008. After digging around a bit, I found a fix in the data_loader.

kchax4377 commented 4 years ago

@dardodel I have a sample of one-line texts in a text file. When I executed your suggested command, shown below, no results were generated in the ./results directory. python3 ./src/train.py -task abs -mode test_text -text_src ./raw_data/test_text.txt -batch_size 140 -test_batch_size 200 -log_file ./logs/pred.log -test_from ./models/model_step_148000.pt -sep_optim true -use_interval true -max_pos 512 -max_length 20 -alpha 0.95 -min_length 5 -result_path ./results -test_all True

nlpyang commented 4 years ago

This is indeed a bug; I have pushed an update to fix it. Sorry about that.

nikisix commented 4 years ago

@nlpyang, in master? What was the fix? Asking because I pulled and still get the empty results, the first-sentence(s)-only summaries, and the empty gold file problems.

dhouhaomri commented 4 years ago

Hi, when I run this command, no results are generated in the results directory.

python /users/omri/workspace/Trainbert/PreSumm/src/train.py -task ext -mode test_text -test_from /users/omri/workspace/Trainbert/PreSumm/models/ext_model/model_step_50000.pt -text_src /users/omri/workspace/Trainbert/PreSumm/raw_data/temp_ext.raw_src -text_tgt /users/omri/workspace/Trainbert/PreSumm/results/result.raw_tgt -log_file ../logs/ext_bert_cnndm -visible_gpus 1

How can I fix that?