In the big stories, there are inconsistencies between the input file and the XMLs with the gold annotations. It turns out that not all tweets are annotated.
In the Burma event, only 63 of the 78 tweets in the input file occur in the XML and are annotated with entities and propositions.
This results in spurious precision errors: our predicted graph is built from all tweets in the input file, so anything predicted from the 15 unannotated tweets has no gold counterpart and is counted as a false positive.
We should modify the evaluation to take this into account: only send to the parser tweets that occur in the corresponding gold file.
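
A minimal sketch of the proposed filtering, in case it helps. The gold XML schema and the input file format are not described here, so everything about them below is an assumption: the element name `tweet`, the attribute name `id`, the representation of input tweets as (id, text) pairs, and the helper names `gold_tweet_ids` and `filter_to_gold` are all hypothetical and would need to be adapted to the real files.

```python
import xml.etree.ElementTree as ET


def gold_tweet_ids(xml_text, tag="tweet", attr="id"):
    """Collect the tweet ids that appear in a gold annotation XML string.

    `tag` and `attr` are placeholders (assumptions), not the real schema.
    """
    root = ET.fromstring(xml_text)
    return {el.get(attr) for el in root.iter(tag) if el.get(attr) is not None}


def filter_to_gold(tweets, gold_ids):
    """Keep only the (tweet_id, text) pairs whose id occurs in the gold file."""
    return [(tid, text) for tid, text in tweets if tid in gold_ids]


# Toy data standing in for one story; not taken from the real corpus.
gold_xml = """<story>
  <tweet id="1"><entity>Burma</entity></tweet>
  <tweet id="3"><entity>election</entity></tweet>
</story>"""
input_tweets = [
    ("1", "first tweet"),
    ("2", "tweet missing from the gold file"),
    ("3", "third tweet"),
]

gold_ids = gold_tweet_ids(gold_xml)
to_parse = filter_to_gold(input_tweets, gold_ids)
print(f"{len(to_parse)} of {len(input_tweets)} tweets sent to the parser")
```

Filtering before parsing, rather than dropping predictions afterwards, keeps the predicted graph and the gold graph built over the same set of tweets.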