Closed zhimin-z closed 10 months ago
Failed means the MLLM fails to perform the corresponding task, i.e. the evaluation results are far below expectations. For example, in the keypoint detection task, if none of the keypoints identified in the MLLM response are correct, the result is labeled Failed. In the facial classification task CelebA(Smile), the answer is either 'yes' or 'no', so random guessing yields 50% accuracy; if the accuracy is below 50%, we also consider it a failure.
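For readers who want the rule spelled out, here is a minimal sketch of the two Failed criteria described above. The function names, the `chance` parameter, and the `report` helper are hypothetical and are not taken from the LAMM or ChEF code; the 50% threshold is the random-guess accuracy mentioned for the yes/no task.

```python
# Hypothetical helpers illustrating the "Failed" criteria described above.
# None of these names come from the LAMM-Benchmark code base.

def is_failed_binary(accuracy: float, chance: float = 0.5) -> bool:
    """A yes/no task counts as failed when accuracy is below random guessing."""
    return accuracy < chance


def is_failed_keypoints(num_correct: int) -> bool:
    """Keypoint detection counts as failed when no predicted keypoint is correct."""
    return num_correct == 0


def report(score: float, failed: bool) -> str:
    """Show the raw score and append the label, rather than hiding the value."""
    return f"{score:.1f} (Failed)" if failed else f"{score:.1f}"


if __name__ == "__main__":
    accuracy = 0.48  # made-up example value, not a benchmark result
    print(report(accuracy * 100, is_failed_binary(accuracy)))
```

The `report` helper keeps the raw number next to the label, which is one way to combine the labeling rule with the request to show exact values.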
Thanks for your explanation. But I still think it is better to show the exact values rather than hide them, since they may be more informative than a simple "FAILED".
Thanks for your suggestion. However, the LAMM-Benchmark results are out of date; we recommend ChEF as the latest benchmark.