I'm trying to reproduce the abstractive summarization results from the Lightweight convolution paper. I'm looking for a script, or the steps, to preprocess the CNN/DM dataset with Fairseq. I searched online but couldn't find any information on how to preprocess the data. The raw data contains multi-sentence summaries, so I'm not sure how to handle them so that the decoder generates all of the sentences.
Any help is appreciated.