microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI
MIT License
20.21k stars 2.55k forks

LayoutLMV2/LayoutXLM DocVQA Evaluation #634

Closed oguz-akkas-deepsee closed 2 years ago

oguz-akkas-deepsee commented 2 years ago

Model I am using (UniLM, MiniLM, LayoutLM ...): LayoutLMV2 and LayoutXLM for the DocVQA task.

I am wondering which metric is used for the evaluation. The original QA tasks use exact match and F1 score, and the transformers library provides SQuAD metrics based on exact match and F1. However, those require some additional ground-truth data. Has anyone tried DocVQA with LayoutLMV2 or LayoutXLM and applied a metric to the results?

wolfshow commented 2 years ago

@oguz-akkas-deepsee The metric for DocVQA is described at https://rrc.cvc.uab.es/?ch=17&com=tasks
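
For readers who cannot open the link: to my knowledge, the metric defined for the DocVQA challenge is ANLS (Average Normalized Levenshtein Similarity). Below is a minimal sketch of it, not the official evaluation script. The function names `levenshtein` and `anls`, the lowercase-and-strip normalization, and the default threshold of 0.5 are assumptions made for illustration; check the task page for the exact rules before reporting numbers.

```python
from typing import List, Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for insert, delete, and substitute."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def anls(
    predictions: Sequence[str],
    references: Sequence[List[str]],
    threshold: float = 0.5,  # assumed default, verify against the task page
) -> float:
    """Average over questions of the best per-answer similarity.

    predictions[i] is the predicted answer for question i;
    references[i] is the list of accepted ground-truth answers.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0

    total = 0.0
    for pred, answers in zip(predictions, references):
        pred_norm = pred.strip().lower()  # assumed normalization
        best = 0.0
        for answer in answers:
            ans_norm = answer.strip().lower()
            longest = max(len(pred_norm), len(ans_norm))
            # Normalized distance in [0, 1]; two empty strings count as identical.
            nl = levenshtein(pred_norm, ans_norm) / longest if longest else 0.0
            # Similarity is zeroed once the normalized distance reaches the threshold.
            score = 1.0 - nl if nl < threshold else 0.0
            best = max(best, score)
        total += best
    return total / len(predictions)
```

Unlike exact match, this gives partial credit for answers that differ from the ground truth by a few characters (for example OCR errors), while the threshold keeps clearly wrong answers at zero. It needs only the predicted answer strings and the list of accepted answers per question.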