Describe Model I am using (UniLM, MiniLM, LayoutLM ...): LayoutLMV2 and LayoutXLM models for the DocVQA task.
I am wondering what metrics are being used for evaluation. The original QA tasks use exact match and F1 score, and the transformers library provides metrics for SQuAD tasks based on exact match and F1, but those require some additional ground truth data. Has anyone tried DocVQA with LayoutLMV2 or LayoutXLM and applied these metrics?
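For reference, the SQuAD-style exact match and token-level F1 mentioned above can be sketched in plain Python without the extra ground-truth fields (e.g. answer spans) that the library metric expects. This is a minimal sketch following the SQuAD normalization conventions, not the official evaluation script; the example strings are hypothetical:

```python
import re
import string
from collections import Counter

def normalize(s):
    """Lowercase, drop punctuation and articles, collapse whitespace (SQuAD-style)."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in string.punctuation)
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match(pred, gold):
    """1.0 if normalized strings are identical, else 0.0."""
    return float(normalize(pred) == normalize(gold))

def f1(pred, gold):
    """Token-overlap F1 between normalized prediction and gold answer."""
    pred_toks = normalize(pred).split()
    gold_toks = normalize(gold).split()
    common = Counter(pred_toks) & Counter(gold_toks)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)

# Hypothetical model output vs. ground truth:
print(exact_match("the Eiffel Tower", "Eiffel Tower"))          # 1.0 after normalization
print(round(f1("Eiffel Tower in Paris", "Eiffel Tower"), 2))    # 0.67
```

In practice you would take the max score over all annotated gold answers per question; note that the DocVQA benchmark itself reports ANLS rather than EM/F1, so the two sets of numbers are not directly comparable.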