EvolvingLMMs-Lab / lmms-eval

Accelerating the development of large multimodal models (LMMs) with lmms-eval
https://lmms-lab.github.io/

Question about truncated generation results on CMMMU #48

Open lucasjinreal opened 2 months ago

lucasjinreal commented 2 months ago

Many of the generated results are truncated like this:

 {
            "doc_id": 65,
            "target": "C",
            "doc": {
                "id": "2526",
                "type": "选择",
                "source_type": "website",
                "source": "https://wenku.baidu.com/view/d35b9cf950ea551810a6f524ccbff121dc36c56c.html?_wkts_=1701758578757",
                "question": "歌剧《唐璜》的曲作者是________。<img=\"q_02526_001.jpg\">--",
                "option1": "莫奈",
                "option2": "伦勃朗",
                "option3": "莫扎特",
                "option4": "罗丹",
                "answer": "C",
                "analysis": null,
                "distribution": "本科",
                "difficulty_level": "easy",
                "subcategory": "艺术理论",
                "category": "艺术与设计",
                "subfield": "['歌剧', '音乐剧研究']",
                "img_type": "['海报']",
                "image_1_filename": "q_02526_001.jpg",
                "image_2_filename": null,
                "image_3_filename": null,
                "image_4_filename": null,
                "image_5_filename": null
            },
            "arguments": [
                [
                    "请回答以下多项选择题,并选出正确选项。这些题目可能包括单选和多选题型。如果所提供的信息不足以确定一个明确的答案,那么请根据可用的数据和你的判断来选择最可能正确的选项。\n\n问题:歌剧《唐璜》的曲作者是________。<图片 1>--\n选项:\n(A) 莫奈\n(B) 伦勃朗\n(C) 莫扎特\n(D) 罗丹\n\n正确答案:\n",
                    65,
                    "cmmmu_val",
                    "val"
                ]
            ],
            "resps": [
                [
                    "《唐璜》是歌剧作品,其曲作者是威尔第("
                ]
            ],
            "filtered_resps": [
                "《唐璜》是歌剧作品,其曲作者是威尔第("
            ],
            "cmmmu_acc": {
                "id": "2526",
                "subdomain": "艺术理论",
                "question_type": "选择",
                "answer": "C",
                "parsed_pred": "D"
            }
        },

When I run inference on this image directly, the model produces a complete output. What causes the truncation here?

kcz358 commented 2 months ago

Hi, the max_new_tokens we set for cmmmu is 16, since the model should only output answer choices such as A, B, C, D. If you want longer generations on this task, you can set this parameter higher.
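To illustrate why a 16-token cap cuts off a sentence-style answer but leaves a bare option letter intact, here is a minimal sketch. It does not use lmms-eval or any real tokenizer: `generate_with_cap` is a hypothetical helper, and treating each character as one token is an assumption made purely for illustration.

```python
from typing import Iterable, List


def generate_with_cap(token_stream: Iterable[str], max_new_tokens: int) -> str:
    """Collect tokens until the stream ends or the cap is reached."""
    out: List[str] = []
    for token in token_stream:
        if len(out) >= max_new_tokens:
            break  # the cap is hit, so everything after this point is lost
        out.append(token)
    return "".join(out)


# Toy "tokens" (assumption: one character per token).
short_answer = list("C")
long_answer = list("The composer of Don Giovanni is Mozart, so the answer is C.")

print(repr(generate_with_cap(short_answer, 16)))  # fits within the cap
print(repr(generate_with_cap(long_answer, 16)))   # cut off before the letter appears
```

With a cap this small, a model that explains its answer before naming a letter never reaches the letter, so the answer parser has nothing reliable to extract.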

lucasjinreal commented 2 months ago

Hi, it looks like my small models have a hard time following the instruction precisely. I'm curious whether it is possible to edit the prompt for the question, for example: "请回答下列选择题,请直接回答选项字母。" or "Answer with the option's letter from the given choices directly.", just like VLMEvalKit does?

kcz358 commented 2 months ago

For some of the tasks we have implemented model_specific_kwargs, but sadly this is not included in the cmmmu task.

For now, you can try hardcoding your prompt in this function:

https://github.com/EvolvingLMMs-Lab/lmms-eval/blob/bf4c78b7e405e2ca29bf76f579371382fec3dd02/lmms_eval/tasks/cmmmu/utils.py#L12-L52
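A minimal sketch of what hardcoding a direct-answer instruction could look like. This is not the code in the linked utils.py: `build_choice_prompt` and `DIRECT_ANSWER_INSTRUCTION` are hypothetical names, the image-tag replacement is an assumption, and the layout simply mirrors the prompt string visible in the logged `arguments` above.

```python
import itertools
import re

# Hypothetical constant: the instruction suggested earlier in this thread.
DIRECT_ANSWER_INSTRUCTION = "请回答下列选择题,请直接回答选项字母。"


def build_choice_prompt(doc: dict, instruction: str = DIRECT_ANSWER_INSTRUCTION) -> str:
    """Assemble a multiple-choice prompt in the layout seen in the logged arguments."""
    # Replace each <img="..."> tag with a numbered placeholder such as <图片 1>.
    counter = itertools.count(1)
    question = re.sub(
        r'<img="[^"]*">',
        lambda _match: "<图片 {}>".format(next(counter)),
        doc["question"],
    )
    # Options are stored as option1..option4 and shown as (A)..(D).
    option_lines = []
    for index, letter in enumerate("ABCD", start=1):
        option_lines.append("({}) {}".format(letter, doc["option{}".format(index)]))
    options = "\n".join(option_lines)
    return "{}\n\n问题:{}\n选项:\n{}\n\n正确答案:\n".format(instruction, question, options)


example_doc = {
    "question": '歌剧《唐璜》的曲作者是________。<img="q_02526_001.jpg">--',
    "option1": "莫奈",
    "option2": "伦勃朗",
    "option3": "莫扎特",
    "option4": "罗丹",
}
print(build_choice_prompt(example_doc))
```

Keeping the instruction as a default argument makes it easy to swap in a different wording per question type without touching the layout logic.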

lucasjinreal commented 2 months ago

OK, I changed the prompt to:

    # "task_instructions": [
    #     "请回答以下多项选择题,并选出正确选项。这些题目可能包括单选和多选题型。如果所提供的信息不足以确定一个明确的答案,那么请根据可用的数据和你的判断来选择最可能正确的选项。",
    #     "请回答以下判断题,并根据题目描述和所给的信息来判断问题中陈述的对错。如果信息不完整或不足以作出绝对判断,请运用你的逻辑推理和现有信息来做出最可能的判断。",
    #     "请回答以下填空题,并根据题目的要求和所提供的信息来给出最恰当的答案。如果信息不足以确切回答,那么请依据现有的数据和你的推理能力来填写最合理的答案。",
    # ],
    "task_instructions": [
        "请回答以下多项选择题,并选出正确选项。你只需要回答正确选项对应的字母。可能为单选也可能为多选。",
        "请回答以下判断题,仅需要回答对或者错。",
        "请回答以下填空题,填写空白处正确的内容。",
    ],

It boosts my model's performance on CMMMU by 2 points.