Open quanghuynguyen1902 opened 5 years ago
You need to format your data the same way as the CNN or DM dataset, and then it will work. Otherwise, modify the tokenizing file to match your data. That's the solution; this is not really an issue with the code.
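For anyone unsure what "format your data like CNN/DM" means in practice, here is a minimal sketch. It assumes the preprocessing step reads raw `.story` files in the original CNN/DailyMail layout, where the article text comes first and each summary sentence follows a line containing `@highlight`. The function names `to_story` and `write_stories`, the file naming scheme, and the output directory are hypothetical and not part of this repo; check what the tokenizing script actually expects before relying on this.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def to_story(article: str, summary_sentences: List[str]) -> str:
    """Render one example in the CNN/DM .story layout.

    Layout: article text, then one "@highlight" marker followed by
    one summary sentence, with blank lines between every part.
    """
    parts = [article.strip()]
    for sentence in summary_sentences:
        parts.append("@highlight")
        parts.append(sentence.strip())
    return "\n\n".join(parts) + "\n"


def write_stories(examples: Iterable[Tuple[str, List[str]]], out_dir: str) -> List[Path]:
    """Write each (article, summary_sentences) pair to its own .story file.

    The zero-padded file names are an arbitrary choice for this sketch.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, (article, highlights) in enumerate(examples):
        path = out / f"{i:06d}.story"
        path.write_text(to_story(article, highlights), encoding="utf-8")
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Hypothetical custom data: one article with a two-sentence summary.
    data = [("Full text of my own article.", ["First summary sentence.", "Second summary sentence."])]
    for p in write_stories(data, "my_stories"):
        print(p)
```

After this, the resulting directory can be passed to the tokenizing step in place of the CNN/DM stories directory, assuming that step only depends on the file layout and not on the original file names.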
It looks like we just need to put the source, followed by the summary, on each line, separated by a unique token?
If I have article content that is not from CNN or DM, how do I process the data?