m2e2 evaluation #26 (limanling/m2e2, GitHub Issues)

Hi,

Thank you for sharing your work. However, the evaluation protocol for the m2e2 dataset is not entirely clear to me.

Table 3 of your paper reports the evaluation results.

Consider only the multimedia training rows, where the models take a multimodal document as input. What does "Text-Only Evaluation" mean in that setting? Is the model evaluated on the "text_only_event.json" file alone, or on both the "text_only_event.json" and "text_multimedia_event.json" files?

Thanks.