Closed by BlackFeetMouse 4 years ago
Thanks for your feedback! To train the model, we require a content plan, a box-score table, and a game summary.
Thanks for your reply!
The paper mentions that the content plan was extracted with an information extraction (IE) approach (Wiseman et al. 2017) and that the type of each data record was predicted. I am wondering whether this could cause the content plan to contain erroneous data too.
Could you tell me whether the content plan used for generating the text was also extracted with the IE approach, and whether that content plan might contain erroneous data items?
Thank you so much for your time and consideration.
In an ideal scenario, we would have a gold content plan annotated by humans. As this is not feasible, we use a (silver) content plan instead. Yes, this content plan is obtained by running the IE approach on the gold training summaries, and it does sometimes contain erroneous data items. We studied the accuracy of the IE approach on a held-out set in our paper. From our paper: "On held-out data it achieved 94% accuracy, and recalled approximately 80% of the relations licensed by the records." During inference, we predict the content plan and generate the summary based on the predicted content plan.
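To make the source of these errors concrete, here is a minimal sketch of how a silver content plan could be assembled from IE output. It is an illustration under assumptions, not the project's actual code: the `Record` class, the `build_silver_plan` function, and the example player and values are all hypothetical, and the matching rule (keep an extracted relation only if it corresponds to a record in the box score) is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical box-score record: (entity, record type, value)."""
    entity: str
    rtype: str
    value: str


def build_silver_plan(extracted, box_score):
    """Assumed rule: keep extracted relations that match a table record,
    in the order they appear in the summary, without duplicates."""
    table = set(box_score)
    plan = []
    for rec in extracted:
        if rec in table and rec not in plan:
            plan.append(rec)
    return plan


# Made-up example data, not taken from ROTOWIRE.
box_score = [
    Record("Player A", "PTS", "20"),
    Record("Player A", "REB", "5"),
    Record("Player A", "AST", "5"),
]

# Suppose the summary says "20 points and 5 rebounds", but the IE model
# predicts the wrong type (AST) for the second relation.
extracted = [
    Record("Player A", "PTS", "20"),  # correct relation
    Record("Player A", "AST", "5"),   # wrong type, yet it matches a table record
    Record("Player A", "STL", "3"),   # matches no record, so it is dropped
]

plan = build_silver_plan(extracted, box_score)
for rec in plan:
    print(rec.entity, rec.rtype, rec.value)
```

In this sketch the mistyped relation survives because a record with that type and value happens to exist in the table, which is one way an erroneous item can end up in a silver plan. Relations that match no record are filtered out.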
I greatly appreciate your assistance with my question!
Hello, thank you so much for sharing such a nice project.
I have read the paper and have one question about the content plan.
Could you tell me whether the data used for training the model included a content plan, or whether it used only the original ROTOWIRE (Wiseman et al. 2017) game summaries and box-score data?
Thank you so much.