Open ghost opened 5 years ago
I am also interested in an answer to the above question: https://github.com/google-research/bert/issues/352#issue-398233998. Kindly do reply. Thanks.
Are these 11 tasks listed here ? : https://ai.googleblog.com/2019/01/looking-back-at-googles-research.html?m=1
I know the eleven tasks but wanted to know if anyone has used this for abstractive text summarization?
I did extractive summarization: after getting the embeddings I clustered them and took one sentence from each cluster. What are the steps of abstractive summarization? Let me give it a try.
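For reference, a minimal sketch of that clustering approach, assuming you already have one embedding per sentence (e.g. BERT [CLS] vectors) in an array; the function name and scikit-learn usage here are my own, not code from this thread:

```python
# Extractive summarization sketch: cluster sentence embeddings with KMeans,
# then keep the sentence closest to each cluster centroid.
# `embeddings` is assumed to be an (n_sentences, hidden_size) array.
import numpy as np
from sklearn.cluster import KMeans

def cluster_summary(sentences, embeddings, n_clusters=3):
    embeddings = np.asarray(embeddings)
    kmeans = KMeans(n_clusters=n_clusters, random_state=0).fit(embeddings)
    picked = []
    for c in range(n_clusters):
        members = np.where(kmeans.labels_ == c)[0]
        # distance of each member sentence to its cluster centroid
        dists = np.linalg.norm(embeddings[members] - kmeans.cluster_centers_[c], axis=1)
        picked.append(members[np.argmin(dists)])
    # keep the selected sentences in their original order
    return " ".join(sentences[i] for i in sorted(picked))
```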
https://github.com/santhoshkolloju/bert_summ I have replaced the encoder with BERT and kept the transformer decoder as it is. Let me know if it helps.
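For anyone who just wants the gist of that architecture, here is a rough sketch of the same idea in PyTorch. This is not the repo's Texar code: the class and variable names are illustrative, target positional encodings are omitted for brevity, and it assumes a recent PyTorch (>= 1.9, for `batch_first`) plus the `transformers` library.

```python
# Sketch: BERT as a (frozen) encoder, a standard Transformer decoder that
# cross-attends to BERT's hidden states and predicts summary tokens.
import torch
import torch.nn as nn
from transformers import BertModel

class BertEncoderDecoder(nn.Module):
    def __init__(self, vocab_size, d_model=768, n_heads=8, n_layers=6):
        super().__init__()
        self.encoder = BertModel.from_pretrained("bert-base-uncased")
        for p in self.encoder.parameters():
            p.requires_grad = False          # freeze BERT; train only the decoder
        self.tgt_embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=n_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, src_mask, tgt_ids):
        # Source side: BERT's hidden states serve as the decoder's "memory".
        memory = self.encoder(input_ids=src_ids,
                              attention_mask=src_mask).last_hidden_state
        tgt = self.tgt_embed(tgt_ids)
        tgt_len = tgt_ids.size(1)
        # Additive causal mask so each position only attends to earlier tokens.
        causal_mask = torch.triu(
            torch.full((tgt_len, tgt_len), float('-inf'), device=tgt_ids.device),
            diagonal=1)
        hidden = self.decoder(tgt, memory, tgt_mask=causal_mask)
        return self.out(hidden)              # (batch, tgt_len, vocab_size) logits
```

Freezing the encoder here mirrors what ended up working later in this thread: only the decoder parameters are updated during training.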
I think you are using Google Colab. I want to run the same thing on a local machine that has a P4000 GPU with 8 GB of memory; it is modest, but I think it suffices for my requirements. However, I am unable to run it there. Can you tell me how to work around this? Thanks in advance.
What is the error you get? By default, Texar places all the tensors on the GPU.
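If it does turn out to be a GPU memory or placement issue on the local 8 GB card, one common TF 1.x workaround (a general suggestion, not something from the notebook) is to create the session with soft placement and incremental memory growth:

```python
import tensorflow as tf

# Let ops without a GPU kernel fall back to CPU, and let the GPU allocate
# memory as needed instead of grabbing it all up front.
config = tf.ConfigProto(allow_soft_placement=True)
config.gpu_options.allow_growth = True

with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
```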
### While running this block (i.e. the last block)
```python
# tx.utils.maybe_create_dir(model_dir)
model_dir = "gs://bert_summ/models/"  # uncased_L-12_H-768_A-12/bert_model.ckpt
logging_file = "logging.txt"
logger = utils.get_logger(logging_file)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    sess.run(tf.tables_initializer())

    smry_writer = tf.summary.FileWriter(model_dir, graph=sess.graph)

    if run_mode == 'train_and_evaluate':
        logger.info('Begin running with train_and_evaluate mode')

        if tf.train.latest_checkpoint(model_dir) is not None:
            logger.info('Restore latest checkpoint in %s' % model_dir)
            saver.restore(sess, tf.train.latest_checkpoint(model_dir))

        iterator.initialize_dataset(sess)

        step = 5000
        for epoch in range(max_train_epoch):
            iterator.restart_dataset(sess, 'train')
            step = _train_epoch(sess, epoch, step, smry_writer)

    elif run_mode == 'test':
        logger.info('Begin running with test mode')
        logger.info('Restore latest checkpoint in %s' % model_dir)
        saver.restore(sess, tf.train.latest_checkpoint(model_dir))
        _eval_epoch(sess, 0, mode='test')

    else:
        raise ValueError('Unknown mode: {}'.format(run_mode))
```
### The error I am getting is:
```
PermissionDeniedError                     Traceback (most recent call last)
```
I received the same error.
The problem is that I am writing to my Google Cloud Storage bucket, which you will not have access to. Please change the location to your local filesystem (replace all gs:// file paths with local paths).
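Concretely, that just means pointing the paths in the notebook at a local directory, e.g. (the directory name is arbitrary; do the same for any other gs:// paths):

```python
# Write checkpoints and summaries to a local directory instead of the gs:// bucket.
model_dir = "./models/"            # was: "gs://bert_summ/models/"
tx.utils.maybe_create_dir(model_dir)
```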
> https://github.com/santhoshkolloju/bert_summ I have replaced the encoder with BERT and kept the transformer decoder as it is. Let me know if it helps.
Do you have any examples of generated summaries?
I cannot share the results since it's my own data, but I got good results; the model was able to copy rare words as well. Initially I tried fine-tuning both the encoder (BERT) and the decoder, because of which the BERT weights got disturbed. Then I froze the BERT weights and trained only the decoder part. That gave much more readable and grammatically correct sentences.
@santhoshkolloju, can you share your experience? When I use your code, 'hypotheses' always has the same value for every reference. For example:

```
references: ['do', 'n', "'", 't', 'wear', 'rings', 'when', 'working', 'on', 'engine', 'internal', '##s', '.', '[PAD]', '[PAD]', '[PAD]', ...]
hypotheses: ['do', 'n', "'", 't', 'try', 'to', 'do', 'n', "'", 't', 'mix', '.', '', '', '', ...]

references: ['broke', 'the', 'elevators', 'at', 'work', ',', 'basically', 'shot', 'myself', 'in', 'the', 'foot', 'in', 'doing', 'so', 'because', 'all', 'our', 'heavy', 'shit', 'is', 'downstairs', '.', '[PAD]', '[PAD]', '[PAD]', ...]
hypotheses: ['do', 'n', "'", 't', 'try', 'to', 'do', 'n', "'", 't', 'mix', '.', '', '', '', ...]
```
I used the TIFU dataset suggested in the "Abstractive Summarization of Reddit Posts with Multi-level Memory Networks" paper.
There was a problem. Freeze the BERT weights and run again: call tf.trainable_variables(), exclude all the variables whose names start with "bert", and then pass only the non-BERT variables to the optimizer.
@santhoshkolloju Sorry for my question... I tried hard to freeze the weights, but I do not know what to do based on this code. Can you suggest how to freeze them? I tried modifying run_pretraining.py using export_savedmodel and removed all the TPU-related code, so I created a saved_model.pb file, but loading the .pb file fails.
In the notebook I shared, replace that line with the code shown below and run again; it should work:

```python
all_vars = tf.trainable_variables()
non_bert_vars = [v for v in all_vars if 'bert' not in v.name]

train_op = tx.core.get_train_op(
    mle_loss,
    learning_rate=learning_rate,
    variables=non_bert_vars,
    global_step=global_step,
    hparams=opt)
```
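A quick sanity check after that change (illustrative, not from the notebook): the counts should show that BERT's parameters are no longer handed to the optimizer.

```python
# Confirm that no BERT variables are among those being optimized.
print('total trainable variables:', len(all_vars))
print('variables passed to the optimizer:', len(non_bert_vars))
assert all('bert' not in v.name for v in non_bert_vars)
```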
Thank you for your advice. I finally trained it. Freezing the BERT encoder gives much more readable and grammatically correct sentences, but it still cannot summarize well :'(.. Maybe we need more techniques like pointer-generator, bottom-up summarization, etc. :)
In my case my data is a somewhat easy one. It was not generating the sentences verbatim but rephrasing them with the same meaning. Try training for more iterations or passing entity information to the model.
A pointer-generator is needed when you have unknown tokens in the data; with subword tokenization there are hardly any unknowns.
But let me know if you were able to improve on this
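To illustrate that last point, here is a small check with the repo's WordPiece tokenizer; the vocab path is a placeholder for wherever your uncased_L-12_H-768_A-12 files live.

```python
# Rare or unseen words are decomposed into word pieces, so '[UNK]' almost
# never appears. `tokenization` is the tokenization.py module from this repo.
import tokenization

tokenizer = tokenization.FullTokenizer(
    vocab_file="uncased_L-12_H-768_A-12/vocab.txt",  # placeholder path
    do_lower_case=True)

print(tokenizer.tokenize("crankshaft"))
# -> a list of word pieces (the exact split depends on the vocab), not ['[UNK]']
```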
I am looking to use the BERT model for abstractive text summarization. I checked out @santhoshkolloju's code and will run it and see; however, it would be really helpful if someone could guide me to articles/papers/resources/code for abstractive summarization with BERT.
Check out this paper: https://arxiv.org/pdf/1902.09243.pdf
They still haven't released their code yet, but I'm currently working on reimplementing it in PyTorch and will make the code public once I'm done with it.
I would be, to put it mildly, extremely interested in this!
I tried the summarization on some wiki articles. I split the text into sentences, averaged the CLS vectors from each sentence to get a "whole-text CLS vector", and then just picked the few sentences that were most similar to the whole-text CLS vector (cosine similarity). The results were interesting, but not good enough for anything serious (too simple and vague, I guess).
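For completeness, a minimal sketch of that centroid/cosine-similarity approach, assuming you already have one [CLS] vector per sentence; the function name is illustrative, not code from this thread.

```python
# Pick the k sentences whose [CLS] vectors are most similar (cosine) to the
# average [CLS] vector of the whole text.
import numpy as np

def centroid_summary(sentences, cls_vectors, k=3):
    cls_vectors = np.asarray(cls_vectors)            # (n_sentences, hidden_size)
    centroid = cls_vectors.mean(axis=0)
    sims = cls_vectors @ centroid / (
        np.linalg.norm(cls_vectors, axis=1) * np.linalg.norm(centroid) + 1e-8)
    top = sorted(np.argsort(-sims)[:k])              # keep original sentence order
    return " ".join(sentences[i] for i in top)
```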
For those interested, looks like we have an implementation! https://github.com/nayeon7lee/bert-summarization
it is not complete... :(
It's been almost half a year since BERT released. Does anybody know where to find any colab notebook which shows working summarization example?
> Thank you for your advice. I finally trained it. Freezing the BERT encoder gives much more readable and grammatically correct sentences, but it still cannot summarize well :'(.. Maybe we need more techniques like pointer-generator, bottom-up summarization, etc. :)
Could you share your experience with a BERT encoder + transformer decoder + pointer-generator? I wonder whether it will summarize well with a pointer-generator. Thanks.
Is there anyone alive?
There is this paper
Fine-tune BERT for Extractive Summarization
https://arxiv.org/pdf/1903.10318.pdf
I would love a colab example as well.
@santhoshkolloju I think the result might have something to do with the batch size? I tried to print out the batches generated by the iterator (FeedableDataIterator from Texar), and despite trying to set batch_size to 32, the size of the generated batch remained 4...
Edit:
Okay, I finally get why the batch size is always 4:
```python
train_dataset = get_dataset(processor, tokenizer, "./", max_seq_length_src, max_seq_length_tgt, 4, 'train', "./")
eval_dataset = get_dataset(processor, tokenizer, "./", max_seq_length_src, max_seq_length_tgt, 4, 'eval', "./")
test_dataset = get_dataset(processor, tokenizer, "./", max_seq_length_src, max_seq_length_tgt, 4, 'test', "./")
```
Those lines in the colab example are the culprit...
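If anyone else hits this, the fix is just to pass the intended batch size in place of the hard-coded 4 (a sketch; the argument list is copied from the notebook lines above):

```python
batch_size = 32  # whatever your GPU can handle

train_dataset = get_dataset(processor, tokenizer, "./", max_seq_length_src,
                            max_seq_length_tgt, batch_size, 'train', "./")
eval_dataset = get_dataset(processor, tokenizer, "./", max_seq_length_src,
                           max_seq_length_tgt, batch_size, 'eval', "./")
test_dataset = get_dataset(processor, tokenizer, "./", max_seq_length_src,
                           max_seq_length_tgt, batch_size, 'test', "./")
```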
@Santosh-Gupta They have their code released here: https://github.com/nlpyang/BertSum, though it's not a colab example
hmm, any idea how to use it to end up with a function like `summary_result = BertSum.summarize("Text to be summarized")`?
> There is this paper
> Fine-tune BERT for Extractive Summarization
Not extractive, Abstractive example please...
So you're looking to generate a summary, not just extract the most important sentences?
I'm looking for both, thanks for the above link. I will look into it right away.
> So you're looking to generate a summary, not just extract the most important sentences?
Yes
I tried a BERT encoder + a similar transformer decoder for generating summaries, and none of my attempts worked. I believe there are a lot of fine-tuning tricks I didn't realize were needed.
It looks like this repo is complete now:
https://github.com/nlpyang/BertSum
Also, UNILM gives some great abstractive summarization scores, maybe the best
> UNILM gives some great abstractive summarization scores
I found only the UNILM paper. Do you know where we can download the model? Any code example?
The authors say that they are preparing a release of the code and pretrained model
I found one useful paper that gives better performance than BERT for text summarization. Paper: https://arxiv.org/pdf/1905.02450.pdf Code: https://github.com/microsoft/MASS The code is not fully ready yet, though.
Please see our paper using BERT for both extractive and abstractive summarization
https://arxiv.org/abs/1908.08345
With code and models released at https://github.com/nlpyang/PreSumm
How can we make BERT use our own ebooks (.azw, .epub, or .mobi) or PDF files? And also a specific website, for example success.com. Thank you.
@nlpyang, great! What do the results look like? Could somebody post some examples here, please?
BERT is designed to solve 11 NLP problems, which includes text summarization.
Is there any example of how we can use BERT for summarizing a document? An approach would do, and example code would be really great.
Thanks in advance.