[CVPRW 2022] Attentions Help CNNs See Better: Attention-based Hybrid Image Quality Assessment Network
Evaluating AHIQ on traditional IQA datasets #3

A question about the results in the original paper: how is the evaluation on the traditional datasets (LIVE, CSIQ, TID) performed? Do you report average performance over K runs? The paper only mentions that the datasets are split 60-20-20 into train/val/test. Please add a more detailed description.

I have the same question, especially after testing on the whole TID2013 dataset with the code 'test.py'. The results are unsatisfactory: both the PLCC and SRCC scores are very low. Can anyone explain how to test correctly?