Closed: HamzaBjitro closed this issue 3 years ago.
@HamzaBjitro There are currently no pre-trained models that can be used to abstractively summarize long documents. The models listed in the "BART Converted to LongformerEncoderDecoder" section need to be fine-tuned on a long document summarization dataset, such as ArXiv-PubMed, before they can summarize long sequences. The ArXiv-PubMed models will be trained as soon as I obtain the resources necessary to train them (2 Tesla V100 GPUs).
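For completeness, you *can* load one of the converted checkpoints directly, but the output will not be a meaningful summary until the model is fine-tuned. Here is a minimal sketch, assuming the `longformer` package from allenai/longformer is installed and using a placeholder path for the downloaded `.bin`/config directory:

```python
import torch
from transformers import BartTokenizer
# LongformerEncoderDecoderForConditionalGeneration subclasses BART's
# conditional-generation model, so the standard generate() API applies
from longformer import LongformerEncoderDecoderForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
model = LongformerEncoderDecoderForConditionalGeneration.from_pretrained(
    "path/to/downloaded/model"  # placeholder: directory containing the .bin and config files
)
model.eval()

document = "Very long input document ..."
inputs = tokenizer(document, return_tensors="pt", max_length=4096, truncation=True)

attention_mask = inputs["attention_mask"]
attention_mask[:, 0] = 2  # a value of 2 marks global attention in allenai's implementation

with torch.no_grad():
    summary_ids = model.generate(
        inputs["input_ids"],
        attention_mask=attention_mask,
        num_beams=4,
        max_length=256,
    )

# Until the model is fine-tuned on a dataset like ArXiv-PubMed, this output
# will not be a usable summary.
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```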
I've updated the documentation to reflect this.
This issue is a duplicate of #36 but is explained more clearly here so discussion will continue here.
I was trying to follow the instructions at https://transformersum.readthedocs.io/en/latest/general/getting-started.html, but they don't work for me. The Google Drive link contains `.bin` files, while `model = AbstractiveSummarizer.load_from_checkpoint("path/to/ckpt/file")` needs a `.ckpt` file. I have also tried `LongformerEncoderDecoderForConditionalGeneration.from_pretrained()`, but I can't use the resulting model to create summaries. All I want is to test a pre-trained model on a long document. Can you please guide me on how to do so?
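For reference, here is roughly what I tried (the paths are placeholders, and the imports assume TransformerSum's `src/` directory and the allenai `longformer` package are on my Python path):

```python
# Attempt 1: following the getting-started instructions, which expect a
# .ckpt file, but the Drive link only provides .bin weights
from abstractive import AbstractiveSummarizer

model = AbstractiveSummarizer.load_from_checkpoint("path/to/ckpt/file")  # fails: no .ckpt available

# Attempt 2: loading the .bin weights directly
from longformer import LongformerEncoderDecoderForConditionalGeneration

model = LongformerEncoderDecoderForConditionalGeneration.from_pretrained("path/to/drive/folder")
# this loads, but I don't know how to produce a summary from here
```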