X-PLUG / mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
Apache License 2.0

About TextVQA metric result #55

Closed · SWHL closed this 2 months ago

SWHL commented 2 months ago

Recently, I tried to reproduce the metric results on the TextVQA dataset. I ran the evaluation code for TextVQA and obtained TextVQA_test_pred_official_eval.json. Following the instructions, I submitted it to the official website, but the evaluation failed.

The Stderr file shows:

Traceback (most recent call last):
  File "/code/scripts/workers/submission_worker.py", line 538, in run_submission
    submission_metadata=submission_serializer.data,
  File "/tmp/tmp48rl7bfv/compute/challenge_data/challenge_830/main.py", line 202, in evaluate
    prepare_objects(annFile, resFile, phase_codename)
  File "/tmp/tmp48rl7bfv/compute/challenge_data/challenge_830/main.py", line 109, in prepare_objects
    vqaRes = vqa.loadRes(res, resFile)
  File "/tmp/tmp48rl7bfv/compute/challenge_data/challenge_830/vqa.py", line 170, in loadRes
    'Results do not correspond to current VQA set. Either the results do not have predictions for all question ids in annotation file or there is atleast one question id that does not belong to the question ids in the annotation file. Please note that this year, you need to upload predictions on ALL test questions for test-dev evaluation unlike previous years when you needed to upload predictions on test-dev questions only.'
AssertionError: Results do not correspond to current VQA set. Either the results do not have predictions for all question ids in annotation file or there is atleast one question id that does not belong to the question ids in the annotation file. Please note that this year, you need to upload predictions on ALL test questions for test-dev evaluation unlike previous years when you needed to upload predictions on test-dev questions only.

I tried both the Test-Dev Phase and the Test-Standard Phase, and got the same error.

So I would like to ask for help: how do you evaluate this dataset?

This is the JSON file: TextVQA_test_pred_official_eval.json
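
In case it helps to diagnose the mismatch, this is the kind of local sanity check I run before submitting (just a rough sketch; the questions-file name and its "data"/"question_id" fields are assumptions based on the TextVQA 0.5.1 release, and the prediction file is assumed to be a list of {"question_id", "answer"} dicts):

```python
import json

# Placeholder file names: the prediction file produced by the evaluation code
# and the official TextVQA questions JSON downloaded locally.
with open("TextVQA_test_pred_official_eval.json") as f:
    preds = json.load(f)  # assumed: list of {"question_id": ..., "answer": ...}

with open("TextVQA_0.5.1_test.json") as f:
    questions = json.load(f)["data"]  # assumed structure of the official file

pred_ids = {p["question_id"] for p in preds}
gt_ids = {q["question_id"] for q in questions}

# The EvalAI assertion fires if either of these sets is non-empty.
print("question ids missing from predictions:", len(gt_ids - pred_ids))
print("predicted ids not in the question file:", len(pred_ids - gt_ids))
```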

Thanks.

HAWLYQ commented 2 months ago

Hi @SWHL, you submitted to the VQA challenge, not the TextVQA challenge. Try https://eval.ai/web/challenges/challenge-page/874/submission
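
For splits where the answers are public (e.g. the validation set), you can also score locally. TextVQA is scored with the VQA-style soft accuracy; a simplified sketch is below (the official evaluator additionally normalizes answers and averages over subsets of the 10 human answers, which this omits):

```python
def vqa_accuracy(pred_answer: str, human_answers: list) -> float:
    """Simplified VQA-style soft accuracy: a prediction gets full credit
    if at least 3 of the human annotators gave the same answer."""
    pred = pred_answer.strip().lower()
    matches = sum(a.strip().lower() == pred for a in human_answers)
    return min(matches / 3.0, 1.0)

# Example: 4 of the 10 annotators agree with the prediction -> accuracy 1.0
print(vqa_accuracy("stop", ["stop"] * 4 + ["halt"] * 6))
```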

SWHL commented 2 months ago

Thank you very much, the issue has been resolved. Great work; I was able to reproduce the metrics on all 10 datasets I tested.