Processing CNN/DM dataset

facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

MIT License


Processing CNN/DM dataset #1753

Closed. lioutasb closed this issue 4 years ago.

lioutasb commented 4 years ago

I'm trying to reproduce the abstractive summarization results from the Lightweight convolution paper. I'm looking for a script, or the steps, to process the CNN/DM dataset with Fairseq. I searched online but couldn't find any information on how to preprocess the data. The raw data contains multi-sentence summaries, so I'm not sure how I'm supposed to handle them so that the decoder generates all of the sentences.

Any help is appreciated.

SeolhwaLee commented 4 years ago

Hi @lioutasb, I have the same problem. Did you solve this?
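
For the multi-sentence summary question above, one possible approach is to flatten each raw story into a single source line and a single target line, with all summary sentences joined on the target side so the decoder is trained to emit them as one sequence. The sketch below assumes the raw CNN/DM `.story` layout, where the article text is followed by `@highlight` markers that each precede one summary sentence. The function names `split_story` and `to_parallel_pair` and the " . " joining convention are illustrative assumptions, not the preprocessing used in the paper.

```python
from typing import List, Tuple

# Marker that precedes each summary sentence in the raw .story files (assumed layout).
HIGHLIGHT_MARKER = "@highlight"


def split_story(raw: str) -> Tuple[str, List[str]]:
    """Split one raw story into (article text, list of highlight sentences)."""
    article_lines: List[str] = []
    highlights: List[str] = []
    next_is_highlight = False
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between paragraphs and markers
        if line == HIGHLIGHT_MARKER:
            next_is_highlight = True
        elif next_is_highlight:
            highlights.append(line)
            next_is_highlight = False
        else:
            article_lines.append(line)
    return " ".join(article_lines), highlights


def to_parallel_pair(raw: str, sep: str = " ") -> Tuple[str, str]:
    """Return one source line and one target line for a raw story.

    Hypothetical convention: highlights without a final period get " ."
    appended so sentence boundaries survive after joining.
    """
    article, highlights = split_story(raw)
    sentences = [h if h.endswith(".") else h + " ." for h in highlights]
    return article, sep.join(sentences)
```

Writing the resulting pairs to parallel files with one example per line (for instance `train.source` and `train.target`, names chosen here only for illustration) would give text in the form that `fairseq-preprocess` expects via `--source-lang`, `--target-lang`, and `--trainpref`. Tokenization, BPE, and truncation of long articles are left out of this sketch.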