rebuttalpapers opened this issue 6 months ago
Hi, thanks for your question! The score range for all dimensions is 0 to 1.
For samples at different scores in each dimension, you can refer to Section G of the supplementary materials: https://arxiv.org/pdf/2311.17982, where we provide samples at varying scores for each dimension.
Thanks, Ziqi, for your helpful answer. May I ask why `dynamic_degree` is `false` for my video, while `subject_consistency` and `imaging_quality` are much larger than 1?
Hi, I assume you are asking about the scores for individual videos in the generated `eval_results.json` file.

For `dynamic_degree`, each video undergoes binary classification, with `true` referring to dynamic and `false` referring to static. The final score for the `dynamic_degree` dimension is defined as the percentage of videos classified as dynamic.

For other dimensions with per-video values larger than 1, there are two possible reasons: (1) the individual video's raw score is in the range 0-100; (2) the individual video's raw score has not yet been divided by the frame count.

We retain these raw scores for individual videos in case users need them for debugging. However, you should refer to the final aggregated score for each dimension to assess the model's performance in that particular dimension.
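For reference, here is a minimal sketch of the rules described above. This is not VBench's internal code; the helper names and the frame-count denominator are assumptions, and it only illustrates how per-video raw values relate to the aggregated 0-1 scores:

```python
# Minimal sketch, NOT VBench's implementation: it only illustrates the
# aggregation/normalization rules described above. The frame-count
# convention in normalize_by_frames is an assumption.

def aggregate_dynamic_degree(labels):
    """Final dynamic_degree score = fraction of videos labeled dynamic (True)."""
    return sum(labels) / len(labels)

def normalize_0_100(raw_score):
    """Case (1): the per-video raw score lies in 0-100, e.g. imaging_quality."""
    return raw_score / 100.0

def normalize_by_frames(raw_score, num_frames):
    """Case (2): the per-video raw score has not yet been divided by the
    frame count (assumed denominator; check the source for the exact one)."""
    return raw_score / num_frames

# Values of the kind reported later in this thread:
print(aggregate_dynamic_degree([True, False, False, False, False, False]))  # 0.1666...
print(normalize_0_100(72.89873886108398))  # ~0.729 (per-video, before averaging)
```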
How can I get the original classification probability for `dynamic_degree`?
Hi, it's not a probability-based classification; it's threshold-based.
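To spell out what threshold-based means here, below is a purely illustrative sketch (the names and the threshold value are assumptions, not VBench's actual code): a numeric motion statistic exists internally, but only the result of the comparison is saved, so exposing the value would require modifying the dimension's code before the thresholding step.

```python
import numpy as np

# Illustrative only: dynamic_degree compares a motion statistic (derived
# from optical flow between frames) against a fixed threshold, so only the
# boolean label ends up in eval_results.json. `flow_magnitudes`,
# `classify_dynamic`, and THRESHOLD are hypothetical names, and 6.0 is a
# placeholder, not VBench's actual threshold.

THRESHOLD = 6.0

def classify_dynamic(flow_magnitudes: np.ndarray):
    motion_score = float(flow_magnitudes.mean())  # the numeric value you are after
    is_dynamic = motion_score > THRESHOLD         # only this boolean is kept
    return motion_score, is_dynamic
```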
Hello! Sorry, but I still don't see how to obtain a numeric value instead of a boolean. A boolean is not useful for my evaluations.
Currently, VBench can evaluate the following list of dimensions: ['subject_consistency', 'background_consistency', 'temporal_flickering', 'motion_smoothness', 'dynamic_degree', 'aesthetic_quality', 'imaging_quality', 'object_class', 'multiple_objects', 'human_action', 'color', 'spatial_relationship', 'scene', 'temporal_style', 'appearance_style', 'overall_consistency']
I ran some of them on my customized videos; however, the score range differs across dimensions.
For one of my videos, the five per-video scores are:

- subject_consistency: 10.982122957706451
- motion_smoothness: 0.9960492387493192
- dynamic_degree: false
- aesthetic_quality: 0.6582092642784119
- imaging_quality: 72.89873886108398

while the overall (aggregated) scores are:

- subject_consistency: 0.9861730885776606
- motion_smoothness: 0.9909714810295909
- dynamic_degree: 0.16666666666666666
- aesthetic_quality: 0.6556713245809078
- imaging_quality: 0.7093512528141342
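(For reference, this is roughly how I am reading the two sets of numbers. The sketch assumes `eval_results.json` maps each dimension to a pair of [aggregated score, list of per-video results]; adjust it to the file's actual layout:)

```python
import json

# Sketch under an assumed (unverified) layout:
#   {"imaging_quality": [aggregated_score,
#                        [{"video_path": ..., "video_results": ...}, ...]],
#    ...}
with open("eval_results.json") as f:
    results = json.load(f)

for dimension, (aggregated, per_video) in results.items():
    print(f"{dimension}: aggregated = {aggregated}")
    for entry in per_video:
        print(f"  {entry['video_path']}: raw = {entry['video_results']}")
```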
Can you elaborate on the score range for each video and what it means? It would also help to provide example score anchors for each dimension. For example, if the range of aesthetic_quality is 0-1, I would like to know what 0.1, 0.5, and 0.9 look like, respectively. Thanks!