h4nwei / SPAQ

[CVPR2020] Official SPAQ & Implementation

Score error of the pretrained model, which makes me doubt the experimental results. #20

Closed youyuge34 closed 2 years ago

youyuge34 commented 2 years ago

I use the same versions listed in your requirements. The output scores must contain some error. For instance, the picture 00006.jpg in the dataset has these scores:

MOS: 33.17
Bright: 23.83
Colorful: 48
Contrast: 46
Noise: 63
Sharpness: 47

With the pretrained weights of BL-demo, the output score is 54.924. With the pretrained weights of MT_A, the outputs make no sense:

54.965328216552734
1 -->  60.083614349365234
2 -->  61.15263366699219
3 -->  51.99977111816406
4 -->  64.30211639404297
5 -->  57.21396255493164

Given the L1 loss and the strong performance of ResNet, I wonder why this happens. I strongly doubt the reliability and accuracy of the experiments in the paper. Please resolve my doubt. This problem was also mentioned in #3, where you could not explain the reason.

h4nwei commented 2 years ago

@youyuge34 Thanks for your interest in our work. As mentioned in #3, the proposed BIQA models pay more attention to the correlation with subjective scores. As a result, the output quality scores may not match the annotated values as closely as you hoped. We do not think there are any problems with the experimental results because they have been reproduced multiple times.
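To illustrate the difference between correlation and absolute accuracy, here is a minimal sketch in plain Python. The numbers are made up for illustration and are not SPAQ data or model outputs, and the helper names (`pearson`, `spearman`, `mean_abs_error`) are hypothetical, not part of this repository. It shows that predictions can rank images in exactly the same order as the subjective scores, giving correlations close to 1, while still being far from them in absolute value.

```python
from statistics import mean


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC) of two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mean_abs_error(x, y):
    """Mean absolute difference between two equal-length lists."""
    return mean(abs(a - b) for a, b in zip(x, y))


# Made-up illustrative values (NOT taken from SPAQ or from the model).
subjective = [20.0, 35.0, 50.0, 65.0, 80.0]
predicted = [48.0, 52.0, 56.0, 60.0, 64.0]  # same ordering, compressed range

print("SRCC:", spearman(subjective, predicted))
print("PLCC:", pearson(subjective, predicted))
print("MAE: ", mean_abs_error(subjective, predicted))
```

In this toy case both correlations are essentially 1 while the mean absolute error is large, so a single image's score can look wrong even when the ranking over a test set is good. Whether that fully accounts for the gap reported on 00006.jpg is not something this sketch can establish.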