Vchitect / VBench

[CVPR2024 Highlight] VBench - We Evaluate Video Generation
https://vchitect.github.io/VBench-project/
Apache License 2.0

Where is the human preference data? #22

Open linzhiqiu opened 5 months ago

ziqihuangg commented 5 months ago

It's provided here: https://drive.google.com/drive/folders/1jYAybu2BazShGV-DLityFi4j7BjTE-my?usp=sharing

ziqihuangg commented 5 months ago

More details are provided here: https://github.com/Vchitect/VBench/tree/master/sampled_videos#human-preference-labels
You can also preview the data on Hugging Face: https://huggingface.co/datasets/Vchitect/VBench_sampled_video

weixi-feng commented 5 months ago

Hi, could you please provide more details on how Spearman's correlation coefficient is computed for each dimension? Thank you.

ziqihuangg commented 4 months ago

Hi, we conduct pairwise comparisons of sampled videos, using both VBench results and human preference annotations. We then compute the Spearman correlation between the VBench win ratios and the human win ratios, over all pairs drawn from the VBench prompt suite.

For more details on the human preference data and the computation, please refer to our paper: (1) Section 3.3 - Human Preference Annotation, (2) Section 4.2 - Validating Human Alignment of VBench, and (3) Section I of the supplementary material - Human Preference Annotation. https://arxiv.org/abs/2311.17982 Thanks!
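
For readers who want to see the shape of this computation, here is a minimal sketch, not the actual VBench code. It assumes each pairwise comparison is stored as a hypothetical `(model_a, model_b, winner)` tuple, that a tie gives both models half a win, and that the correlation is taken over per-model win ratios for one dimension. The function names, model names, and toy numbers are all made up for illustration; check the paper sections above for the exact protocol.

```python
from itertools import combinations


def win_ratio(comparisons):
    """Per-model win ratio from (model_a, model_b, winner) tuples.

    winner is one of the two model names, or None for a tie.
    Assumption: a tie counts as half a win for each side.
    """
    wins, total = {}, {}
    for a, b, winner in comparisons:
        for m in (a, b):
            wins.setdefault(m, 0.0)
            total[m] = total.get(m, 0) + 1
        if winner is None:
            wins[a] += 0.5
            wins[b] += 0.5
        else:
            wins[winner] += 1.0
    return {m: wins[m] / total[m] for m in wins}


def comparisons_from_scores(scores):
    """Turn per-video metric scores into pairwise outcomes.

    scores maps model name -> list of scores, aligned by index so that
    position i is the same prompt/sample for every model.
    """
    out = []
    for a, b in combinations(sorted(scores), 2):
        for sa, sb in zip(scores[a], scores[b]):
            winner = a if sa > sb else b if sb > sa else None
            out.append((a, b, winner))
    return out


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


# Toy data for one dimension (made-up numbers, three hypothetical models).
metric_scores = {"A": [0.9, 0.8], "B": [0.7, 0.75], "C": [0.5, 0.6]}
human_labels = [
    ("A", "B", "A"), ("A", "B", "B"),
    ("A", "C", "A"), ("A", "C", "A"),
    ("B", "C", "B"), ("B", "C", None),
]

metric_wr = win_ratio(comparisons_from_scores(metric_scores))
human_wr = win_ratio(human_labels)
models = sorted(metric_wr)
rho = spearman([metric_wr[m] for m in models], [human_wr[m] for m in models])
print(metric_wr, human_wr, rho)
```

In this toy case both sources rank the models A > B > C, so the correlation is 1. If SciPy is available, `scipy.stats.spearmanr` can replace the hand-written `spearman` helper.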

Chenzhou2344 commented 3 months ago

Hi, can I understand this as follows: compute the win ratio over the 5 x 4 videos for each prompt, compute the Spearman correlation from these win ratios, and then average across the entire prompt suite?