Closed dennlinger closed 1 year ago
As discussed with Svea Klaus from the EUR-LexSum dataset, it will be helpful to know which kind of duplication may occur.
Now introduces four types:
- `exact_duplicate`, where the exact combination of `(reference, summary)` has been encountered before.
- `both_duplicate`, where both the reference and summary have been encountered before, but separately and not together.
- `reference_duplicate`, where only the reference has been encountered before.
- `summary_duplicate`, where only the summary has been encountered before.
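
For illustration, the four types could be assigned roughly as in the sketch below. This is not the code from this PR: the function name `classify_duplicates`, the in-memory sets, and the use of exact string equality to decide "encountered before" are all assumptions made for the example.

```python
from typing import Iterable, List, Optional, Set, Tuple


def classify_duplicates(pairs: Iterable[Tuple[str, str]]) -> List[Optional[str]]:
    """Label each (reference, summary) pair with a duplicate type, or None.

    Hypothetical helper: "encountered before" is taken to mean an exact
    string match against an earlier sample in iteration order.
    """
    seen_pairs: Set[Tuple[str, str]] = set()
    seen_references: Set[str] = set()
    seen_summaries: Set[str] = set()
    labels: List[Optional[str]] = []

    for reference, summary in pairs:
        reference_seen = reference in seen_references
        summary_seen = summary in seen_summaries

        if (reference, summary) in seen_pairs:
            # The exact combination has been encountered before.
            label: Optional[str] = "exact_duplicate"
        elif reference_seen and summary_seen:
            # Both parts were seen, but never together.
            label = "both_duplicate"
        elif reference_seen:
            label = "reference_duplicate"
        elif summary_seen:
            label = "summary_duplicate"
        else:
            label = None
        labels.append(label)

        # Record the sample only after labelling it, so a sample is never
        # flagged as a duplicate of itself.
        seen_pairs.add((reference, summary))
        seen_references.add(reference)
        seen_summaries.add(summary)

    return labels
```

The order of the checks matters: the pair lookup comes first because an exact duplicate also satisfies the conditions for every other type.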