Thank you for sharing the code.
I tested the extractive setting on a different summarization dataset and found that at most 3 sentences are output for each sample. This may meet the requirements of the CNN/DM dataset, but it may not be suitable for other datasets, where the target summaries can be longer than 3 sentences.
So I suggest modifying the code at trainer_ext.py#L275 to use the hyper-parameter self.args.max_tgt_len to control the length of the output sequence.
https://github.com/nlpyang/PreSumm/blob/ce8dc017fbef7c12b1b4bd764f0c3d20911ead5e/src/models/trainer_ext.py#L275
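For illustration, here is a minimal sketch of the idea as a standalone function (the function name and arguments are hypothetical, not from the repo; I'm assuming the linked line is the hard-coded `len(_pred) == 3` break inside the sentence-selection loop):

```python
def select_sentences(selected_ids, src_sents, max_sents):
    """Pick up to max_sents sentences for the extractive summary.

    Replaces the hard-coded cap of 3 with a configurable limit,
    which would come from self.args.max_tgt_len in trainer_ext.py.
    """
    pred = []
    for j in selected_ids:
        if j >= len(src_sents):
            continue
        pred.append(src_sents[j].strip())
        # original code: if len(_pred) == 3: break
        if len(pred) == max_sents:
            break
    return pred
```

With this change, passing a larger `-max_tgt_len` at the command line would let the extractor emit longer summaries on datasets whose references exceed 3 sentences.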