Ji4chenLi closed this issue 1 month ago
Hi, thank you for your interest in our work. Simply package the JSON file generated by VBench into a zip file and upload it directly.
Hi Yinan,
Thank you for your response. I tried your suggestion, but I'm still unable to upload the .zip file. Specifically, I zipped all *eval_results.json files into a single .zip file before uploading.
Can you look into the issue?
I apologize; we have just inspected the server and found an issue with our processing logic. The problem has now been fixed. Thank you for your feedback! Please note that you need to upload a zip file, and the first-level directory inside the zip should contain all the *result.json files. Do not modify the JSON content output by VBench. Don't worry about extra files in the zip; they will not be counted in the Leaderboard.
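The packaging step described above can be sketched in Python. The directory name `results/` and the output filename `vbench_submission.zip` are placeholders for illustration, not names VBench prescribes:

```python
import glob
import os
import zipfile

# Collect every per-dimension result file produced by VBench and place
# it at the first level of the archive, leaving the JSON content untouched.
# "results/" and "vbench_submission.zip" are assumed names for illustration.
with zipfile.ZipFile("vbench_submission.zip", "w") as zf:
    for path in glob.glob("results/*eval_results.json"):
        # arcname drops the directory so the file sits at the zip root,
        # matching the "first-level directory" requirement above
        zf.write(path, arcname=os.path.basename(path))
```

Using `arcname` is the key detail: without it, the files would be nested under `results/` inside the archive instead of at the top level.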
I just had another try, but I still failed to upload any files to the leaderboard. Could you take another look?
@Ji4chenLi I noticed that T2V-Turbo (VC2) has been successfully submitted to the leaderboard, but only "scene" and "color" appear to have been submitted. Is that correct?
That submission is actually not what I intended. I submitted one or two JSON files for debugging purposes but still could not submit the entire zip file. If you have time, could you hop on a quick chat to fix the bugs? Or I can send you my zip file directly.
@ziqihuangg has shared your zip file in the email. I have made the necessary code adjustments to accommodate this situation. Could you please try again?
Thank you, Yinan! My submission seems successful, but the Total Score, Quality Score, and Semantic Score are different from my calculation. I will look into it.
I might have found the bug. The leaderboard somehow flips the aesthetic quality and dynamic degree scores of my model. Since dynamic degree uses a different weight (0.5) in the Quality Score calculation, my results end up lower than expected.
Could you please help me fix that? My model T2V-Turbo (VC2) should have aesthetic quality = 63.04 and dynamic degree = 49.17.
Hi @Ji4chenLi , our calculation of total scores applies per-dimension normalisation and weighting. These constant parameters can be found here: https://huggingface.co/spaces/Vchitect/VBench_Leaderboard/blob/main/constants.py
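A minimal sketch of that per-dimension normalisation and weighting. The (min, max) ranges and weights below are illustrative placeholders, not the actual constants from constants.py:

```python
# Illustrative (min, max) ranges and weights; the real values live in
# constants.py of the VBench_Leaderboard space linked above.
NORMALIZE = {
    "aesthetic quality": (0.0, 1.0),
    "dynamic degree": (0.0, 1.0),
}
DIM_WEIGHT = {
    "aesthetic quality": 1.0,
    "dynamic degree": 0.5,  # the 0.5 weight mentioned in this thread
}

def weighted_score(raw):
    """Normalise each dimension to [0, 1], then take a weighted mean."""
    total = sum(
        DIM_WEIGHT[dim] * (val - lo) / (hi - lo)
        for dim, val in raw.items()
        for lo, hi in [NORMALIZE[dim]]
    )
    return total / sum(DIM_WEIGHT[dim] for dim in raw)
```

Because the weights differ per dimension, swapping two dimensions' values changes the weighted result even though the plain average would stay the same.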
Hi Ziqi,
I understand. I have carefully followed your code to do the calculation. The issue is that the aesthetic quality and dynamic degree scores of my model are currently flipped on the leaderboard, leading to a worse Quality Score and Total Score for my model T2V-Turbo (VC2).
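The effect described above can be checked with toy arithmetic, treating the two scores as already-normalised fractions and assuming weight 1.0 for aesthetic quality versus the 0.5 for dynamic degree:

```python
# Values reported in this thread, as fractions
aesthetic, dynamic = 0.6304, 0.4917

def quality(a_score, d_score):
    # assumed weights: aesthetic quality 1.0, dynamic degree 0.5
    return (1.0 * a_score + 0.5 * d_score) / 1.5

correct = quality(aesthetic, dynamic)  # ~0.5842
flipped = quality(dynamic, aesthetic)  # ~0.5379, values swapped as on the leaderboard
# Because the higher score lost its higher weight, the flipped result is lower.
assert flipped < correct
```

This matches the direction of the discrepancy: the flip pushes the higher-scoring dimension onto the smaller weight, so the aggregate drops.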
We've swapped the values of these two dimensions back. Could you check again whether the leaderboard is now consistent with your results? Thanks!
It resolves my issue! Thank you!
Hi,
Thank you so much for your efforts in putting together the comprehensive benchmarks!
Could you provide detailed instructions for submitting the evaluation results? I obtained 16 *eval_results.json files after evaluating across all the dimensions, but it seems that I cannot submit these individual json files to the leaderboard.
Thanks, Jiachen