**changzhisun** opened 4 years ago (status: Open)
Note that in the eval script by default we soft match spans (90% character F1). This should result in the prediction "chris griffin" being considered correct with the reference "chris griffin's".
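For context, here is a minimal sketch of what "soft matching spans at 90% character F1" could look like. This is an assumption for illustration, not the actual code from `qed_eval.py`; the helper names `char_f1` and `soft_match` are hypothetical, and the real script may compute character overlap differently (e.g. positionally rather than as a bag of characters):

```python
from collections import Counter

def char_f1(pred: str, ref: str) -> float:
    """Bag-of-characters F1 between two spans (hypothetical helper;
    the real qed_eval.py logic may differ)."""
    pred_counts = Counter(pred)
    ref_counts = Counter(ref)
    # Multiset intersection counts characters shared by both spans.
    shared = sum((pred_counts & ref_counts).values())
    if shared == 0:
        return 0.0
    precision = shared / len(pred)
    recall = shared / len(ref)
    return 2 * precision * recall / (precision + recall)

def soft_match(pred: str, ref: str, threshold: float = 0.9) -> bool:
    """Accept the prediction when character F1 clears the threshold."""
    return char_f1(pred, ref) >= threshold

print(soft_match("chris griffin", "chris griffin's"))  # True
```

Under this sketch, "chris griffin" vs. "chris griffin's" scores 13/14 ≈ 0.93 character F1, which would indeed clear a 0.9 threshold, consistent with the reply above.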
I checked `qed_eval.py` and found the following code:

```python
if pred_entity.normalized_text == annot_entity.normalized_text:
    if overlap(pred_entity, annot_entity):
        found = True
        break
```
and this similar check for the question and document entity pairs:

```python
if pred_q_ent.normalized_text == annot_q_ent.normalized_text:
    if annot_doc_ent.normalized_text == pred_doc_ent.normalized_text:
        if overlap(pred_q_ent, annot_q_ent):
            if overlap(pred_doc_ent, annot_doc_ent):
                found = True
                break
```
The normalized texts of "chris griffin" and "chris griffin's" are not equal ("chrisgriffin" vs. "chrisgriffins"), so the equality check fires before the overlap test ever runs. Could the check be changed to the following?

```python
if (annot_entity.normalized_text in pred_entity.normalized_text) or \
        (pred_entity.normalized_text in annot_entity.normalized_text):
```
Or should the `if pred_entity.normalized_text == annot_entity.normalized_text:` line be deleted entirely, leaving only the `overlap` check?
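A small sketch of the containment-based test proposed above, with a stand-in `normalize` (lowercasing and stripping non-alphanumeric characters is an assumption; the real normalization in `qed_eval.py` may differ):

```python
def normalize(text: str) -> str:
    """Rough stand-in for qed_eval.py's normalization (assumption):
    lowercase and drop non-alphanumeric characters."""
    return "".join(ch for ch in text.lower() if ch.isalnum())

def texts_compatible(pred: str, annot: str) -> bool:
    """The proposed relaxation: spans pass the text check when one
    normalized string contains the other."""
    p, a = normalize(pred), normalize(annot)
    return p in a or a in p

print(texts_compatible("chris griffin", "chris griffin's"))  # True
print(texts_compatible("chris griffin", "peter griffin"))    # False
```

With strict equality, "chrisgriffin" != "chrisgriffins" fails; with containment, the possessive variant passes the text check and the decision falls through to `overlap`, which is presumably the intent of the soft-matching behavior described above.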
I also found that the coreference annotation (`question_reference`) may not align with word boundaries. For example, the `question_text` is "who does chris griffin's voice on family guy", but the `question_reference` is "chris griffin". In my preprocessing, "griffin's" is a single word. If my model predicts that span, will it be counted as correct in the evaluation?