facebookresearch / KILT

Library for Knowledge Intensive Language Tasks
MIT License

ELI5 KILT annotation #44

Closed carriex closed 3 years ago

carriex commented 3 years ago

Hi, thanks for creating the dataset! I have two questions regarding the annotated ELI5 data:

Thank you!

fabiopetroni commented 3 years ago

Hi @carriex,

thanks a lot for your message. The meta field in the ELI5 dev set contains exhaustive information about our annotation campaign, including the span of text that was highlighted by the annotators. Here are the full guidelines for the annotation campaign:

I hope this helps!
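
For a concrete starting point, here is a minimal sketch of how that meta field can be inspected. The file name eli5-dev-kilt.jsonl and the exact nesting of the fields are assumptions based on the examples quoted later in this thread, not code from the KILT repository:

    # Minimal sketch: peek at the annotation metadata of the ELI5 dev set.
    # Assumes the dev set was downloaded as KILT-style JSONL; the field
    # nesting follows the examples shown later in this thread.
    import json

    with open("eli5-dev-kilt.jsonl") as f:
        for line in f:
            example = json.loads(line)
            partial = example.get("meta", {}).get("partial_evidence", {})
            # Pages annotators marked as partial evidence, plus the text
            # spans they highlighted during the annotation campaign.
            print(example["id"], partial.get("wikipedia_id"), partial.get("title"))
            for m in partial.get("meta", []):
                print("  highlighted:", m.get("evidence_span"))
            break  # look at the first record only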

carriex commented 3 years ago

Hi @fabiopetroni,

thanks for providing the detailed explanation of how the annotation campaign was carried out! I have two follow-up clarification questions:

  1. For the field containing exhaustive information about the annotation campaign, are you referring to the meta field in the example below, which includes partial evidence? It also looks like the partial evidence comes from a different wikipedia page than the one in output/provenance. However, from eval_retrieval.py it looks like the wikipedia page in output/provenance is still the one used to evaluate retrieval performance (see the sketch after the second example below). Is this understanding correct?

    {'id': '1kiwfx',
     'input': 'In Trading Places (1983, Akroyd/Murphy) how does the scheme at the end of the movie work? Why would buying a lot of OJ at a high price ruin the Duke Brothers?',
     'meta': {
         'left_context': '',
         'mention': '',
         'obj_surface': {'text': array([], dtype=object)},
         'partial_evidence': {
             'end_paragraph_id': array([7], dtype=int32),
             'meta': array([{'evidence_span': array([
                 'On television, they learn that Clarence Beeks is transporting a secret USDA report on orange crop forecasts.',
                 'On television, they learn that Clarence Beeks is transporting a secret USDA report on orange crop forecasts. Winthorpe and Valentine recall large payments made to Beeks by the Dukes and realize that the Dukes plan to obtain the report to corner the market on frozen orange juice.',
                 'Winthorpe and Valentine recall large payments made to Beeks by the Dukes and realize that the Dukes plan to obtain the report to corner the market on frozen orange juice.'], dtype=object)}], dtype=object),
             'section': array(['Section::::Plot.\n'], dtype=object),
             'start_paragraph_id': array([7], dtype=int32),
             'title': array(['Trading Places'], dtype=object),
             'wikipedia_id': array(['520990'], dtype=object)},
         'right_context': '',
         'sub_surface': {'text': array([], dtype=object)},
         'subj_aliases': {'text': array([], dtype=object)},
         'template_questions': {'text': array([], dtype=object)}},
     'output': {
         'answer': array(['The final scene involves future contracts. ..."what happens at the end of Trading Places?"', ''], dtype=object),
         'meta': array([], dtype=object),
         'provenance': array([{
             'bleu_score': array([0.92328084], dtype=float32),
             'end_character': array([612], dtype=int32),
             'end_paragraph_id': array([1], dtype=int32),
             'meta': array([], dtype=object),
             'section': array(['Section::::Abstract.'], dtype=object),
             'start_character': array([14], dtype=int32),
             'start_paragraph_id': array([1], dtype=int32),
             'title': array(['Futures contract'], dtype=object),
             'wikipedia_id': array(['242855'], dtype=object)}], dtype=object)}}

  2. My original question is actually about the meta field inside output/provenance. For example, the instance below contains such a field, whereas the annotation in the example above only contains start/end character and paragraph offsets into the wikipedia page. I'm wondering what the difference is between these two kinds of annotations?

    {'id': '3atjp2',
     'input': 'what are benefits of TPP ?',
     'meta': {
         'left_context': '',
         'mention': '',
         'obj_surface': {'text': array([], dtype=object)},
         'partial_evidence': {
             'end_paragraph_id': array([], dtype=int32),
             'meta': array([], dtype=object),
             'section': array([], dtype=object),
             'start_paragraph_id': array([], dtype=int32),
             'title': array([], dtype=object),
             'wikipedia_id': array([], dtype=object)},
         'right_context': '',
         'sub_surface': {'text': array([], dtype=object)},
         'subj_aliases': {'text': array([], dtype=object)},
         'template_questions': {'text': array([], dtype=object)}},
     'output': {
         'answer': array(["The TPP is a trade liberalization treaty...why would FR/UK/NZ etc. want to sign it France and the UK are not part of TPP. That's TTIP, a similar but separate deal.", ''], dtype=object),
         'meta': array([], dtype=object),
         'provenance': array([{
             'bleu_score': array([0.], dtype=float32),
             'end_character': array([-1], dtype=int32),
             'end_paragraph_id': array([1], dtype=int32),
             'meta': array([{
                 'annotation_id': '-1',
                 'evidence_span': {'text': array([
                     'Theory of Motivated Information Management or TMIM, is a social-psychological framework that examines the relationship between information management and uncertainty. The theory posits that individuals are motivated to manage their uncertainty levels when they perceive a discrepancy between the level of uncertainty they have about an important issue and the level of uncertainty they want. In other words, someone may be uncertain about an important issue but decides not to engage or seek information because they are comfortable with that state.\rhighlight sentence(s) containing evidence, not only the answer',
                     'Theory of Motivated Information Management or TMIM, is a social-psychological framework that examines the relationship between information management and uncertainty. The theory posits that individuals are motivated to manage their uncertainty levels when they perceive a discrepancy between the level of uncertainty they have about an important issue and the level of uncertainty they want. In other words, someone may be uncertain about an important issue but decides not to engage or seek information because they are comfortable with that state.'], dtype=object)},
                 'fever_page_id': '',
                 'fever_sentence_id': -1,
                 'yes_no_answer': ''}], dtype=object),
             'section': array(['Section::::Abstract.'], dtype=object),
             'start_character': array([-1], dtype=int32),
             'start_paragraph_id': array([1], dtype=int32),
             'title': array(['Theory of Motivated Information Management'], dtype=object),
             'wikipedia_id': array(['36119336'], dtype=object)}], dtype=object)}}
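
To make my understanding in question 1 concrete, here is a rough sketch of what I believe is happening, i.e. that the gold pages for retrieval come only from output/provenance while meta/partial_evidence is ignored. This is not the actual eval_retrieval.py; the function name and the ranked retrieved_wikipedia_ids input are hypothetical, and it assumes the raw KILT JSONL layout where output is a list of dicts:

    def precision_at_1(gold_example, retrieved_wikipedia_ids):
        """retrieved_wikipedia_ids: ranked page ids from the retriever under test."""
        # Gold pages are taken from output/provenance only; pages that appear
        # just as partial evidence under meta do not count.
        gold_ids = {
            prov["wikipedia_id"]
            for out in gold_example.get("output", [])
            for prov in out.get("provenance", [])
        }
        if not gold_ids or not retrieved_wikipedia_ids:
            return 0.0
        return 1.0 if retrieved_wikipedia_ids[0] in gold_ids else 0.0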

Again, thanks very much for your help!

fabiopetroni commented 3 years ago
  1. We don't consider partial evidence in KILT. A wikipedia page is added to the output only if the majority of annotators mark it as containing full evidence, so there might be other pages with only partial evidence in meta (see the sketch after this list).
  2. I don't fully get the question. In meta we report the complete annotation information, including the partial evidence annotation and the evidence span when available (even though we ignore this information in the evaluation).
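
As a toy illustration of point 1 (the per-annotator labels and the helper below are hypothetical, not the actual annotation aggregation code):

    from collections import Counter

    def promote_to_provenance(annotator_labels):
        """annotator_labels: hypothetical judgements for one (question, page)
        pair, e.g. ["full", "full", "partial"]."""
        # The page is added to output/provenance only if a strict majority of
        # annotators marked it as containing full evidence; pages with only
        # partial-evidence judgements remain in meta.
        return Counter(annotator_labels)["full"] > len(annotator_labels) / 2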
carriex commented 3 years ago

thanks for getting back so quickly! The answer to the first question makes sense to me.

For the second question, I'm referring to the fields inside output/provenance. For the two examples above, the question with id 1kiwfx doesn't contain an evidence span in output/provenance, only start_character, end_character, start_paragraph_id and end_paragraph_id. However, for the question with id 3atjp2 there are two evidence spans inside output/provenance, while start_character and end_character contain the value -1. Does this mean the annotator highlighted an evidence span for 3atjp2, but only selected "Yes, sufficient to answer" for the passage in 1kiwfx (without highlighting any evidence span)? Sorry for any confusion caused, let me know if this is clear to you!

fabiopetroni commented 3 years ago

I see. So sometimes the evidence span is given as character offsets into a paragraph of the knowledge source, and sometimes as a raw string (probably the automatic script failed in those cases). I hope this helps :)
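
A rough sketch of how one might recover the evidence text in both cases; get_paragraph is a hypothetical helper returning the text of (wikipedia_id, paragraph_id) from the KILT knowledge source, and the field nesting follows the two dumps quoted above (scalar offsets assumed, as in the raw JSONL):

    def evidence_text(prov, get_paragraph):
        """prov: one entry of output/provenance."""
        start, end = prov["start_character"], prov["end_character"]
        if start >= 0 and end >= 0:
            # Case 1 (e.g. 1kiwfx): the span is stored as character offsets
            # into a paragraph of the gold page (a single paragraph assumed).
            paragraph = get_paragraph(prov["wikipedia_id"], prov["start_paragraph_id"])
            return paragraph[start:end]
        # Case 2 (e.g. 3atjp2): offsets are -1, so fall back to the literal
        # evidence_span strings stored under the provenance meta.
        spans = []
        for m in prov.get("meta", []):
            spans.extend(m.get("evidence_span", {}).get("text", []))
        return " ".join(spans)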

carriex commented 3 years ago

@fabiopetroni got it! thank you so much for your help! :)