Open erastogi opened 4 months ago
Hi, have you figured this out?
I'm also interested in this
Hi! The quote from the paper means that the reference summary (the one from the original dataset) is also one of the 5 summaries being scored by the annotator. It is not being used to score the other summaries. Let us know if you have other questions!
I am looking at this example:
If I look at the reference summary (referred to as "highlights" in the original dataset), the given system output (i.e., the machine summary) seems very relevant/similar to it. Hence, a relevance score of 1 from the expert doesn't make sense to me.
In the paper, you have mentioned -
I was wondering which "reference" summary you are referring to in this section. Is it the one from the original dataset, or one of the 10 summaries generated by humans? If the latter, is there a way to find out which reference summary the expert used to score a given machine summary?