Closed lifelongeek closed 3 years ago
Hi,
This is a good point. To clarify, the input of the BART encoder is 'keywords + source article', where the keywords are from selected sentences -- the selected sentences are not used as direct input to the encoder.
Unfortunately, we don't have ablation results for skipping sentence selection. My guess is that removing the sentence-selection step is probably fine at training time, while at inference time, directly tagging keywords from a long document may be too noisy.
Thanks for sharing this interesting work and the source code.
In Section 2.2, sentences are greedily selected from the document so that they are highly correlated with the reference summary, while the remaining sentences are expected to have a low correlation with it. Selected sentences exist for both training and inference.
I wonder what the expected pros and cons are of using 'keywords + selected sentences' as input to the BART encoder instead of 'keywords + all sentences'. Do you have any ablation study results on this?
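For readers following the thread, here is a minimal sketch of what greedy sentence selection against a reference summary can look like. It is not the repository's code: the function names `unigram_f1` and `greedy_select`, the whitespace tokenization, the `max_sentences` cap, and the use of a unigram-overlap F1 in place of a real ROUGE implementation are all assumptions made for illustration.

```python
from collections import Counter


def unigram_f1(candidate_tokens, reference_tokens):
    """Unigram-overlap F1; a simple stand-in for ROUGE (assumption)."""
    overlap = sum((Counter(candidate_tokens) & Counter(reference_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate_tokens)
    recall = overlap / len(reference_tokens)
    return 2 * precision * recall / (precision + recall)


def greedy_select(sentences, reference, max_sentences=3):
    """Greedily add the sentence that most improves overlap with the reference.

    Stops when no remaining sentence improves the score or when
    max_sentences (a hypothetical cap) is reached.
    Returns the selected sentence indices in document order.
    """
    ref_tokens = reference.lower().split()
    selected = []
    best_score = 0.0
    while len(selected) < max_sentences:
        best_idx = None
        for i in range(len(sentences)):
            if i in selected:
                continue
            # Score the already-selected sentences plus this candidate.
            joined = " ".join(sentences[j] for j in selected + [i])
            score = unigram_f1(joined.lower().split(), ref_tokens)
            if score > best_score:
                best_score, best_idx = score, i
        if best_idx is None:  # no candidate improves the score
            break
        selected.append(best_idx)
    return sorted(selected)


doc = [
    "the cat sat on the mat",
    "stocks fell sharply on monday",
    "the dog barked loudly",
]
print(greedy_select(doc, "the cat sat while the dog barked"))  # [0, 2]
```

In this toy example the unrelated second sentence is never selected because adding it lowers the overlap score, which matches the intuition that unselected sentences correlate weakly with the reference.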