liuqingli closed this issue 5 years ago
The original PGN implementation truncates the input to 400 tokens, but those are just the first 400 tokens of the concatenation of all documents (so they won't cover all source documents), while our truncation takes up to 500 tokens from each individual source document. Since the summaries cite all the sources and you're not including all the sources, this is likely one of the factors affecting the score. I'm curious what the result is when you run on our truncation.
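Roughly, the difference between the two truncation strategies looks like this (a minimal sketch; the function names and exact handling are illustrative, not our actual preprocessing code):

```python
def truncate_concatenated(docs, max_tokens=400):
    # Original PGN style: concatenate all source documents first,
    # then keep only the first max_tokens tokens of the whole sequence,
    # so later documents may be dropped entirely.
    tokens = " ".join(docs).split()
    return " ".join(tokens[:max_tokens])

def truncate_per_document(docs, max_tokens=500):
    # Our style: truncate each source document separately,
    # so every document contributes up to max_tokens tokens.
    return [" ".join(doc.split()[:max_tokens]) for doc in docs]
```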
Other than NEWLINE_CHAR for newline characters and the separator for separating input documents, we don't have any special separator tags (we just use the OpenNMT "\<s>" and "\</s>" at the beginning and end of the entire input).
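For illustration, an assembled source line would look roughly like the snippet below; "DOC_SEP" is just a placeholder for the document-separator token, not necessarily the token in the released data:

```python
# Hypothetical assembly of a single source line.
raw_docs = ["First article.\nIts second paragraph.", "Second article."]
docs = [d.replace("\n", " NEWLINE_CHAR ") for d in raw_docs]
src_line = "<s> " + " DOC_SEP ".join(docs) + " </s>"
# -> "<s> First article. NEWLINE_CHAR Its second paragraph. DOC_SEP Second article. </s>"
```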
Thanks Alex!
I have run both the "multi-news" and "preprocessed_truncated" datasets with PGN; the ROUGE scores are similar (31.98 for "multi-news" and 33.39 for "preprocessed_truncated").
I think one issue is that my encoder input length is still 400 by default.
Regarding the \<s> and \</s> pairs, do you mean you add them to the input? Following the make_datafiles.py in PGN, I added them to each sentence of the gold summaries.
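Concretely, the make_datafiles.py convention wraps each target sentence, so a gold summary ends up looking like this (just a sketch of what I did, not the script itself):

```python
# Wrap every summary sentence in <s> ... </s>, following make_datafiles.py.
summary_sents = ["the first summary sentence .", "the second summary sentence ."]
target = " ".join("<s> {} </s>".format(s) for s in summary_sents)
# -> "<s> the first summary sentence . </s> <s> the second summary sentence . </s>"
```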
We didn't experiment much with the length, but that's possible. Yes, we just followed the standard way OpenNMT handles beginning-of-sequence and end-of-sequence here.
Thanks a lot! I will check that and let you know the updates.
Closing for now but please open again once you have an update.
I converted the Multi-News dataset (the preprocessed, but not truncated, data) into the original PGN input format (Abisee, 2017), but got much lower ROUGE scores than the ones reported in the Multi-News paper: ROUGE-1/2/3/L/SU4 = 31.98/10.43/5.77/28.98/9.66.
Since the CNNDM dataset works well with the original PGN model, I suspect there might be something wrong with my data conversion procedure.
@Alex-Fabbri Could you let me know how you preprocessed the Multi-News dataset for PGN? Did you add SENT_START and SENT_END as sentence tags? Or is there anything else I need to pay attention to?
Thanks very much.