Hi there,
Great work! I am wondering whether Woodpecker can consistently raise MME scores, since I found that some sub-tasks (e.g., position, color) did not perform well after the correction.
Details: Following the paper, I changed the MLLM output to the format <yes/no + question (like 'there is xx')>.
Is there anything I've missed that could have caused the inferior results?
Thanks a lot!