What does `failed` mean in the test?

zhimin-z commented 10 months ago

Coach257 commented 10 months ago

Failed means the MLLM fails to perform the corresponding task, i.e. its evaluation results are far below expectations. For example, in the keypoint detection task, if none of the keypoints in the MLLM's response are correct, the result is labeled Failed. In the facial classification task CelebA(Smile), the answer is either 'yes' or 'no', so if the accuracy is below 50%, which is the accuracy of random guessing, we also consider it a failure.
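
As a rough illustration of the rule described in this comment, here is a minimal sketch. It is not the actual LAMM evaluation code: the function names `label_binary_accuracy` and `keypoints_failed` are hypothetical, and the 0.5 default simply encodes the random-guess accuracy for a yes/no task.

```python
def label_binary_accuracy(accuracy: float, chance_level: float = 0.5) -> str:
    """Label a yes/no classification result (hypothetical helper).

    An accuracy strictly below the random-guess level is reported as
    "Failed"; otherwise the accuracy is reported as a percentage string.
    """
    if accuracy < chance_level:
        return "Failed"
    return f"{accuracy * 100:.1f}%"


def keypoints_failed(num_correct_keypoints: int) -> bool:
    """Keypoint detection counts as Failed when no keypoint is correct
    (hypothetical helper)."""
    return num_correct_keypoints == 0


if __name__ == "__main__":
    print(label_binary_accuracy(0.42))  # below random guess
    print(label_binary_accuracy(0.75))  # above random guess
    print(keypoints_failed(0))          # no correct keypoints
```

Under this sketch, a result exactly at chance level (50%) is still reported as a number. Only results strictly below it are labeled Failed.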

zhimin-z commented 10 months ago

Thanks for your explanation. But I still think it would be better to show the exact values rather than hiding them, since they might be more informative than a simple "FAILED".

Coach257 commented 10 months ago

Thanks for your suggestion. However, the LAMM-Benchmark results are out of date; we recommend ChEF as the latest benchmark.

OpenGVLab / LAMM, issue #56