google-research / bert

TensorFlow code and pre-trained models for BERT
https://arxiv.org/abs/1810.04805
Apache License 2.0

BERT for text summarization #352

Open ghost opened 5 years ago

ghost commented 5 years ago

BERT is designed to solve 11 NLP problems, which include text summarization.

Is there any example of how we can use BERT to summarize a document? An approach would do, and example code would be really great.

Thanks in advance

makamkkumar commented 5 years ago

I am also interested in an answer to the above question (https://github.com/google-research/bert/issues/352#issue-398233998). Kindly do reply. Thanks

ghost commented 5 years ago

Are these 11 tasks listed here? https://ai.googleblog.com/2019/01/looking-back-at-googles-research.html?m=1

makamkkumar commented 5 years ago

I know the eleven tasks, but I wanted to know if anyone has used this for abstractive text summarization.

ghost commented 5 years ago

I did extractive summarization. After getting the embeddings, I clustered them and took one sentence from each cluster. What are the steps of abstractive summarization? Let me give it a try.

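For anyone who wants to try this, here is a minimal sketch of the cluster-and-pick approach described above. It assumes you already have one embedding per sentence (for example, a [CLS] vector obtained via this repo's extract_features.py); `sentences`, `embeddings`, and `n_clusters` are illustrative names, and scikit-learn is an extra dependency not used elsewhere in this thread.

```python
import numpy as np
from sklearn.cluster import KMeans

def extractive_summary(sentences, embeddings, n_clusters=3):
    """Cluster sentence embeddings and keep the sentence nearest each centroid."""
    km = KMeans(n_clusters=n_clusters, random_state=0).fit(embeddings)
    picked = []
    for k in range(n_clusters):
        members = np.where(km.labels_ == k)[0]
        # Distance of each member sentence to its cluster centroid.
        dists = np.linalg.norm(embeddings[members] - km.cluster_centers_[k], axis=1)
        picked.append(members[np.argmin(dists)])
    # Return picks in document order so the summary reads naturally.
    return [sentences[i] for i in sorted(picked)]
```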

santhoshkolloju commented 5 years ago

https://github.com/santhoshkolloju/bert_summ I have replaced the encoder with BERT and kept the Transformer decoder as it is. Let me know if it helps.

makamkkumar commented 5 years ago

I think you are using Google Colab. I want to run the same on a local machine with a P4000 GPU and 8 GB of RAM; it is modest, but I think it suffices for my requirements. However, I am unable to run it there. Can you tell me how to work around this? Thanks in advance

santhoshkolloju commented 5 years ago

What is the error you get? By default, Texar places all the tensors on the GPU.

makamkkumar commented 5 years ago

### While running this block (i.e., the last block)

```python
# tx.utils.maybe_create_dir(model_dir)

model_dir = "gs://bert_summ/models/"  # uncased_L-12_H-768_A-12/bert_model.ckpt
logging_file = "logging.txt"          # was: os.path.join(model_dir, 'logging.txt')
logger = utils.get_logger(logging_file)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    sess.run(tf.tables_initializer())

    smry_writer = tf.summary.FileWriter(model_dir, graph=sess.graph)

    if run_mode == 'train_and_evaluate':
        logger.info('Begin running with train_and_evaluate mode')

        if tf.train.latest_checkpoint(model_dir) is not None:
            logger.info('Restore latest checkpoint in %s' % model_dir)
            saver.restore(sess, tf.train.latest_checkpoint(model_dir))

        iterator.initialize_dataset(sess)

        step = 5000
        for epoch in range(max_train_epoch):
            iterator.restart_dataset(sess, 'train')
            step = _train_epoch(sess, epoch, step, smry_writer)

    elif run_mode == 'test':
        logger.info('Begin running with test mode')

        logger.info('Restore latest checkpoint in %s' % model_dir)
        saver.restore(sess, tf.train.latest_checkpoint(model_dir))

        _eval_epoch(sess, 0, mode='test')

    else:
        raise ValueError('Unknown mode: {}'.format(run_mode))
```

### The error I am getting is:

```
PermissionDeniedError                     Traceback (most recent call last)
<ipython-input> in <module>()
     10 sess.run(tf.tables_initializer())
     11
---> 12 smry_writer = tf.summary.FileWriter(model_dir, graph=sess.graph)
     13
     14 if run_mode == 'train_and_evaluate':

~/anaconda3/envs/tf-1.8/lib/python3.6/site-packages/tensorflow/python/summary/writer/writer.py in __init__(self, logdir, graph, max_queue, flush_secs, graph_def, filename_suffix)
    350
    351     event_writer = EventFileWriter(logdir, max_queue, flush_secs,
--> 352                                    filename_suffix)
    353     super(FileWriter, self).__init__(event_writer, graph, graph_def)
    354

~/anaconda3/envs/tf-1.8/lib/python3.6/site-packages/tensorflow/python/summary/writer/event_file_writer.py in __init__(self, logdir, max_queue, flush_secs, filename_suffix)
     65     self._logdir = logdir
     66     if not gfile.IsDirectory(self._logdir):
---> 67       gfile.MakeDirs(self._logdir)
     68     self._event_queue = six.moves.queue.Queue(max_queue)
     69     self._ev_writer = pywrap_tensorflow.EventsWriter(

~/anaconda3/envs/tf-1.8/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py in recursive_create_dir(dirname)
    372   """
    373   with errors.raise_exception_on_not_ok_status() as status:
--> 374     pywrap_tensorflow.RecursivelyCreateDir(compat.as_bytes(dirname), status)
    375
    376

~/anaconda3/envs/tf-1.8/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    517         None, None,
    518         compat.as_text(c_api.TF_Message(self.status.status)),
--> 519         c_api.TF_GetCode(self.status.status))
    520     # Delete the underlying status object from memory otherwise it stays alive
    521     # as there is a reference to status from this from the traceback due to

PermissionDeniedError: Error executing an HTTP request (HTTP response code 401, error code 0, error message ''), response '{
  "error": {
    "errors": [
      {
        "domain": "global",
        "reason": "required",
        "message": "Anonymous caller does not have storage.objects.get access to bert_summ/models/.",
        "locationType": "header",
        "location": "Authorization"
      }
    ],
    "code": 401,
    "message": "Anonymous caller does not have storage.objects.get access to bert_summ/models/."
  }
}' when reading metadata of gs://bert_summ/models/
```

CapitalZe commented 5 years ago

I received the same error.

santhoshkolloju commented 5 years ago

The problem is that the code writes to my Google Cloud Storage bucket, which you do not have access to. Please change the location to your local filesystem (replace all gs:// file paths with local paths).
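For example (a hypothetical local layout; point these at wherever you unpacked the checkpoint):

```python
import os

model_dir = "./models/"  # replaces gs://bert_summ/models/
logging_file = os.path.join(model_dir, "logging.txt")
```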

avotiis commented 5 years ago

> https://github.com/santhoshkolloju/bert_summ I have replaced the encoder with BERT and kept the Transformer decoder as it is. Let me know if it helps.

Do you have any examples of generated summaries?

santhoshkolloju commented 5 years ago

I cannot share the results since it is my own data, but I got good results; the model was able to copy rare words as well. Initially I tried fine-tuning both the encoder (BERT) and the decoder, because of which the BERT weights got disturbed. Then I froze the BERT weights and trained only the decoder part. That gave much more readable and grammatically correct sentences.

akakakakakaa commented 5 years ago

@santhoshkolloju, can you share your experience? When I use your code, the hypotheses always have the same value for every reference. For example:

```
references: ['do', 'n', "'", 't', 'wear', 'rings', 'when', 'working', 'on', 'engine', 'internal', '##s', '.', '[PAD]', '[PAD]', '[PAD]', ...]
hypotheses: ['do', 'n', "'", 't', 'try', 'to', 'do', 'n', "'", 't', 'mix', '.', '', '', '', ...]

references: ['broke', 'the', 'elevators', 'at', 'work', ',', 'basically', 'shot', 'myself', 'in', 'the', 'foot', 'in', 'doing', 'so', 'because', 'all', 'our', 'heavy', 'shit', 'is', 'downstairs', '.', '[PAD]', '[PAD]', '[PAD]', ...]
hypotheses: ['do', 'n', "'", 't', 'try', 'to', 'do', 'n', "'", 't', 'mix', '.', '', '', '', ...]
```

I used the TIFU dataset suggested in the "Abstractive Summarization of Reddit Posts with Multi-level Memory Networks" paper.

santhoshkolloju commented 5 years ago

There was a problem. Freeze the BERT weights and run again: call tf.trainable_variables(), exclude all variables whose names start with "bert", and then pass the non-BERT variables to the optimizer.

akakakakakaa commented 5 years ago

@santhoshkolloju Sorry for my question... I tried hard to freeze the weights, but I do not know what to do based on this code. Can you suggest how to freeze them? I tried to fix run_pretraining.py using export_savedmodel and removed all TPU-related code, so I could create a saved_model.pb file, but loading the .pb file fails...

santhoshkolloju commented 5 years ago

In the notebook I shared, replace this line of code as shown below and run again; it should work:

```python
all_vars = tf.trainable_variables()
non_bert = [v for v in all_vars if 'bert' not in v.name]

train_op = tx.core.get_train_op(
    mle_loss,
    learning_rate=learning_rate,
    variables=non_bert,
    global_step=global_step,
    hparams=opt)
```
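As a quick sanity check after this change (assuming the BERT variables live under a scope whose name contains "bert"), you can list what will still be trained:

```python
# None of these names should contain "bert" if the freeze worked.
print([v.name for v in tf.trainable_variables() if 'bert' not in v.name])
```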

akakakakakaa commented 5 years ago

Thank you for your advice. I finally trained it. Freezing the BERT encoder produces much more readable and grammatically correct sentences, but it still cannot summarize well :'(. Maybe we need more techniques like pointer-generator, bottom-up summarization, etc. :)

santhoshkolloju commented 5 years ago

In my case my data is a somewhat easy one. It was not generating the sentences verbatim, but rephrasing them with the same meaning. Try training for more iterations or passing entity information to the model.

santhoshkolloju commented 5 years ago

A pointer-generator is to be used when you have unknowns in the data; with subword tokenization there are hardly any unknowns.

But let me know if you were able to improve on this.
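For reference, the pointer-generator mixture from See et al. (2017) is just a weighted sum of the vocabulary distribution and the copy (attention) distribution. A minimal NumPy sketch, with illustrative names and shapes (not code from any repo in this thread):

```python
import numpy as np

def final_distribution(p_gen, vocab_dist, attn_dist, src_ids, vocab_size):
    """P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum of attention on source copies of w."""
    out = p_gen * vocab_dist              # generation part, shape (vocab_size,)
    copy = np.zeros(vocab_size)
    # Scatter-add attention mass onto the vocabulary ids of the source tokens.
    np.add.at(copy, src_ids, (1.0 - p_gen) * attn_dist)
    return out + copy
```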

asquare14 commented 5 years ago

I am looking to use the BERT model for abstractive text summarization. I checked out @santhoshkolloju's code and will run it and see; however, it would be really helpful if someone could guide me to articles/papers/resources/code for abstractive summarization with BERT.

ajamjoom commented 5 years ago

> I am looking to use the BERT model for abstractive text summarization. I checked out @santhoshkolloju's code and will run it and see; however, it would be really helpful if someone could guide me to articles/papers/resources/code for abstractive summarization with BERT.

Check out this paper: https://arxiv.org/pdf/1902.09243.pdf

They still haven't released their code yet, but I'm currently working on reimplementing it in PyTorch and will make the code public once I'm done with it.

HenryDashwood commented 5 years ago

> > I am looking to use the BERT model for abstractive text summarization. I checked out @santhoshkolloju's code and will run it and see; however, it would be really helpful if someone could guide me to articles/papers/resources/code for abstractive summarization with BERT.
>
> Check out this paper: https://arxiv.org/pdf/1902.09243.pdf
>
> They still haven't released their code yet, but I'm currently working on reimplementing it in PyTorch and will make the code public once I'm done with it.

I would be, to put it mildly, extremely interested in this!

ghost commented 5 years ago

I tried the summarization on some wiki articles. I split the text into sentences, then averaged the [CLS] vectors from each sentence to get a "whole text" [CLS] vector, and just picked the few sentences that were most similar to it (cosine similarity). Results were interesting, but not good enough for anything serious (too simple and vague, I guess).
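A minimal sketch of that approach, assuming `embeddings` holds one [CLS] vector per sentence (illustrative names, NumPy only):

```python
import numpy as np

def centroid_summary(sentences, embeddings, top_k=3):
    """Rank sentences by cosine similarity to the mean [CLS] vector."""
    doc_vec = embeddings.mean(axis=0)
    sims = embeddings @ doc_vec / (
        np.linalg.norm(embeddings, axis=1) * np.linalg.norm(doc_vec) + 1e-9)
    top = np.argsort(-sims)[:top_k]
    # Keep document order so the extract reads naturally.
    return [sentences[i] for i in sorted(top)]
```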

HenryDashwood commented 5 years ago

For those interested, looks like we have an implementation! https://github.com/nayeon7lee/bert-summarization

alexferrari88 commented 5 years ago

> For those interested, looks like we have an implementation! https://github.com/nayeon7lee/bert-summarization

It is not complete... :(

qo4on commented 5 years ago

It's been almost half a year since BERT was released. Does anybody know where to find a Colab notebook that shows a working summarization example?

hzhmelody commented 5 years ago

> Thank you for your advice. I finally trained it. Freezing the BERT encoder produces much more readable and grammatically correct sentences, but it still cannot summarize well :'(. Maybe we need more techniques like pointer-generator, bottom-up summarization, etc. :)

Could you share your experience with a BERT encoder + Transformer decoder + pointer-generator? I wonder whether it will summarize well with a pointer-generator. Thanks

qo4on commented 5 years ago

Is there anyone alive?

Santosh-Gupta commented 5 years ago

There is this paper:

Fine-tune BERT for Extractive Summarization

https://arxiv.org/pdf/1903.10318.pdf

I would love a Colab example as well.

betty35 commented 5 years ago

@santhoshkolloju I think the result might have something to do with the batch size? I tried to print out the batches generated by the iterator (FeedableDataIterator from Texar), and despite trying to set batch_size to 32, the size of the generated batch remained 4...

Edit: Okay, I finally get why the batch size is always 4. These lines in the colab example are the culprit:

```python
train_dataset = get_dataset(processor, tokenizer, "./", max_seq_length_src, max_seq_length_tgt, 4, 'train', "./")
eval_dataset = get_dataset(processor, tokenizer, "./", max_seq_length_src, max_seq_length_tgt, 4, 'eval', "./")
test_dataset = get_dataset(processor, tokenizer, "./", max_seq_length_src, max_seq_length_tgt, 4, 'test', "./")
```
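A hedged fix, keeping the argument order of the calls quoted above but passing the intended batch size instead of the hard-coded 4:

```python
batch_size = 32
train_dataset = get_dataset(processor, tokenizer, "./", max_seq_length_src,
                            max_seq_length_tgt, batch_size, 'train', "./")
eval_dataset = get_dataset(processor, tokenizer, "./", max_seq_length_src,
                           max_seq_length_tgt, batch_size, 'eval', "./")
test_dataset = get_dataset(processor, tokenizer, "./", max_seq_length_src,
                           max_seq_length_tgt, batch_size, 'test', "./")
```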

betty35 commented 5 years ago

@Santosh-Gupta They have their code released here: https://github.com/nlpyang/BertSum, though it's not a colab example

Santosh-Gupta commented 5 years ago

Hmm, any idea how to use it to end up with a function like `summary_result = BertSum.summarize("Text to be summarized")`?

qo4on commented 5 years ago

> There is this paper:
>
> Fine-tune BERT for Extractive Summarization

Not extractive; an abstractive example, please...

Santosh-Gupta commented 5 years ago

> > There is this paper: Fine-tune BERT for Extractive Summarization
>
> Not extractive; an abstractive example, please...

So you're looking to generate a summary, not just extract the most important sentences?

CapitalZe commented 5 years ago

I'm looking for both, thanks for the above link. I will look into it right away.

qo4on commented 5 years ago

> So you're looking to generate a summary, not just extract the most important sentences?

Yes

KaiQiangSong commented 5 years ago

I tried a BERT encoder + a similar Transformer decoder for generating summaries, and none of them worked. I believe there are a lot of tricks I didn't realize are needed for fine-tuning the network.

Santosh-Gupta commented 5 years ago

It looks like this repo is complete:

https://github.com/nlpyang/BertSum

Also, UNILM gives some great abstractive summarization scores, maybe the best.

qo4on commented 5 years ago

> UNILM gives some great abstractive summarization scores

I found only the UNILM paper. Do you know where we can download the model? Any code example?

Santosh-Gupta commented 5 years ago

The authors say that they are preparing a release of the code and pretrained model.

swapnanilsharma commented 5 years ago

I found one useful paper that gives better performance than BERT for text summarization.

Paper: https://arxiv.org/pdf/1905.02450.pdf
Code: https://github.com/microsoft/MASS

The code is not fully ready yet, though.

nlpyang commented 5 years ago

Please see our paper using BERT for both extractive and abstractive summarization:

https://arxiv.org/abs/1908.08345

Code and models are released at https://github.com/nlpyang/PreSumm.

Utomo88 commented 5 years ago

How can we make BERT use our own ebooks (.azw, .epub, or .mobi) or PDF files? And also a specific website, for example success.com? Thank you

jtomek commented 5 years ago

@nlpyang, great! What do the results look like? Could somebody post some examples here, please?