Actually, I wanted to do query-focused abstractive summarization using a transformer model.
How will you be using the query? Will the query select some relevant portion of a given passage, which the model will then summarize?
Yeah, it is kind of like that. Here is an example:
Document: "archbishop john foley a vatican spokesman says in : i know it would be an insult to the priests who have remained faithful to readmit these persons who have left the priesthood in ordering to marry." Query: is it wrong to maintain the tradition of Catholic priest celibacy? Summary: allowing priests to marrying insults celibacy.
You would first need a relevant dataset, but I don't think there really is a publicly available dataset specifically for query-focused summarization. What can be done instead is: 1) train a QA/reading-comprehension model, where the model learns to extract a portion of the text based on a query; 2) train an abstractive summarization model; 3) after training is done, feed the output (the extracted text) of the QA model into model 2 (the summarization model), as sketched below.
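A rough sketch of that two-stage pipeline (the `qa_model` and `summarizer` objects, and their `extract`/`summarize` methods, are purely hypothetical stand-ins, not anything from this repo):

```python
def query_focused_summary(query, document, qa_model, summarizer):
    # Stage 1: a QA / reading-comprehension model extracts the span of the
    # document that is relevant to the query.
    relevant_text = qa_model.extract(query=query, passage=document)
    # Stage 2: an abstractive summarization model compresses that span.
    return summarizer.summarize(relevant_text)
```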
If you somehow get data specifically for query-focused summarization, then ideally you would want to create a reading-comprehension/QA-inspired encoder that encodes both the query and the document (preferably with bi-directional attention and all that). Then use a decoder (RNN/Transformer) over the encoded representation (preferably with interlayer attention) for abstractive summarization. In this way you can train query-focused summarization in a straightforward, end-to-end fashion.
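To make that concrete, here is a rough end-to-end sketch of the idea (in PyTorch, which is an assumption since no framework was fixed here; dimensions, layer counts, and the cross-attention placement are placeholders, and positional encodings and the causal decoder mask are omitted for brevity):

```python
import torch
import torch.nn as nn

class QueryFocusedSummarizer(nn.Module):
    def __init__(self, vocab_size, d_model=256, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.query_encoder = nn.TransformerEncoder(enc_layer, num_layers)
        self.doc_encoder = nn.TransformerEncoder(enc_layer, num_layers)
        # Document positions attend to the query (query/document interaction).
        self.cross_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, query_ids, doc_ids, summary_ids):
        q = self.query_encoder(self.embed(query_ids))          # (B, Lq, d)
        d = self.doc_encoder(self.embed(doc_ids))              # (B, Ld, d)
        # Inject the query into the document representation.
        d_aware, _ = self.cross_attn(query=d, key=q, value=q)  # (B, Ld, d)
        memory = d + d_aware
        # Decoder attends over the query-aware document encoding.
        dec = self.decoder(self.embed(summary_ids), memory)    # (B, Ls, d)
        return self.out(dec)                                   # (B, Ls, vocab)

model = QueryFocusedSummarizer(vocab_size=10000)
logits = model(torch.randint(0, 10000, (2, 12)),   # query token ids
               torch.randint(0, 10000, (2, 80)),   # document token ids
               torch.randint(0, 10000, (2, 30)))   # summary ids (teacher forcing)
```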
Also found this paper from googling (haven't read it but may be relevant): https://arxiv.org/abs/1801.07704
A relevant dataset on query-focused abstractive summarization is available. This is the link: https://arxiv.org/abs/1704.08300. I was trying to use your model: encode both the query and the passage with the encoder, merge both encoder outputs, then decode. But I'm running into a problem with the tensor shapes in the transformer while doing this. Is it possible to do it the way I am thinking, or do you have a better suggestion for me?
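Concretely, the merging step I have in mind is roughly this (just a sketch; the shapes, dimensions, and the PyTorch framework are my placeholders):

```python
import torch

# Toy shapes for illustration (batch-first layout).
enc_query = torch.randn(8, 12, 256)     # (batch, query_len, d_model)
enc_passage = torch.randn(8, 200, 256)  # (batch, passage_len, d_model)

# Merging by concatenating along the time axis keeps d_model intact,
# so the decoder's encoder-decoder attention can consume it directly.
memory = torch.cat([enc_query, enc_passage], dim=1)  # (8, 212, 256)

# Merges like element-wise addition fail instead, because query_len (12)
# and passage_len (200) differ -- which is the kind of shape error I hit.
```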
Yes, your overall idea sounds fine, but whether it will work or not depends on the details of your implementation. Are you trying to go for transformers for both the encoder and the decoder?
Where exactly are you facing a problem with the transformer?
If you are using transformers as the encoder, how exactly are you merging?
One suggestion would be to allow some interaction (potentially attention) between the query and the passage while encoding them, if you aren't already; see the sketch below.
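For instance, one minimal form of that interaction could look like this (a sketch only, loosely in the spirit of passage-to-query attention in reading-comprehension models like BiDAF; the function name and shapes are illustrative):

```python
import torch
import torch.nn.functional as F

def passage_to_query_attention(passage_enc, query_enc):
    """Let every passage position attend over the query encoding.

    passage_enc: (batch, passage_len, d_model)
    query_enc:   (batch, query_len, d_model)
    """
    # Similarity between each passage position and each query position.
    scores = torch.bmm(passage_enc, query_enc.transpose(1, 2))  # (B, Lp, Lq)
    attn = F.softmax(scores, dim=-1)
    # Query-aware representation of the passage.
    query_aware = torch.bmm(attn, query_enc)                     # (B, Lp, d)
    # Concatenate original and query-aware views; later layers (or a simple
    # linear projection) can mix the two.
    return torch.cat([passage_enc, query_aware], dim=-1)         # (B, Lp, 2d)
```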
You mean something like question answering? https://rajpurkar.github.io/SQuAD-explorer/ Usually this is done by encoding both the query and the passage (from which the answer is to be extracted, or abstracted).
I tried to implement models for that long ago: https://github.com/JRC1995/QA-bAbi-R_NET and https://github.com/JRC1995/Dynamic-Memory-Network-Plus.
They aren't really tested though, and the Dynamic Memory Network didn't work as well as in the paper.
Anyway, you may find better approaches in other repos.
Here's a relatively recent paper on QA: https://papers.nips.cc/paper/7739-densely-connected-attention-propagation-for-reading-comprehension.pdf