HHousen / TransformerSum

Models to perform neural summarization (extractive and abstractive) using machine learning transformers and a tool to convert abstractive summarization datasets to the extractive task.
https://transformersum.rtfd.io
GNU General Public License v3.0

Summarizing a long document #38

Closed: HamzaBjitro closed this issue 3 years ago

HamzaBjitro commented 3 years ago

I was trying to follow the instructions in https://transformersum.readthedocs.io/en/latest/general/getting-started.html, but they don't match what's available: the Google Drive link contains `.bin` files, while `model = AbstractiveSummarizer.load_from_checkpoint("path/to/ckpt/file")` expects a `.ckpt` file. I have also tried `LongformerEncoderDecoderForConditionalGeneration.from_pretrained()`, but I can't use the resulting model to create summaries. All I want is to test a pre-trained model on a long document. Can you please guide me on how to do so?
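
For reference, this is roughly what I tried (paths are placeholders, and the second import path is my best guess from the allenai/longformer repo):

```python
from abstractive import AbstractiveSummarizer

# Attempt 1: fails because the Google Drive download contains Hugging Face
# .bin weight files, not the PyTorch Lightning .ckpt checkpoint this expects.
model = AbstractiveSummarizer.load_from_checkpoint("path/to/ckpt/file")

# Attempt 2: the weights load, but I could not find a way to generate a
# summary from the resulting model.
from longformer.longformer_encoder_decoder import (
    LongformerEncoderDecoderForConditionalGeneration,
)

model = LongformerEncoderDecoderForConditionalGeneration.from_pretrained(
    "path/to/extracted/bin/files"
)
```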

HHousen commented 3 years ago

@HamzaBjitro There are currently no pre-trained models that can be used to abstractively summarize long documents. Models listed in the "BART Converted to LongformerEncoderDecoder" section need to be fine-tuned on a long document summarization dataset, such as ArXiv-PubMed, to create a model that can summarize long sequences. The ArXiv-PubMed models will be trained as soon as I obtain the resources necessary to train them (2 Tesla V100 GPUs).

I've updated the documentation to reflect this.
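
Once a fine-tuned checkpoint does exist, inference should look roughly like the sketch below (this follows the pattern in the getting-started docs; the `predict()` usage is assumed and the path is a placeholder):

```python
from abstractive import AbstractiveSummarizer

# Hypothetical checkpoint produced by fine-tuning a converted
# LongformerEncoderDecoder model on a long-document dataset
# such as ArXiv-PubMed.
model = AbstractiveSummarizer.load_from_checkpoint("path/to/finetuned.ckpt")

# Generate an abstractive summary of a long document. predict() is
# assumed to accept raw text, per the getting-started documentation.
summary = model.predict("full text of the long document ...")
print(summary)
```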

HHousen commented 3 years ago

This issue is a duplicate of #36, but the problem is explained more clearly here, so discussion will continue in this thread.