Closed bobby-chiu closed 6 years ago
Uploading test images is blocked by our network. I will look for another way soon.
These are images that the AVA dataset has never seen before. In such cases, the results are sadly pretty flaky. I assume that with sufficient training on enough data from external sources, trained in a similar manner, the results will be better.
I suggest using the NASNet model; it has the best performance. Also, due to how AVA is distributed and how the model was trained, scores below 4 are considered bad and scores above 6 are good.
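For context, the score reported by NIMA is the mean of the model's predicted 10-bin distribution over scores 1 to 10, and the ± value is its standard deviation. A minimal sketch of that computation (the function name is mine, not from this repo):

```python
import numpy as np

def nima_mean_std(probs):
    """Mean and std of a NIMA-style 10-bin score distribution.

    probs: 10 probabilities for scores 1..10 (the model's softmax output).
    """
    probs = np.asarray(probs, dtype=float)
    scores = np.arange(1, 11)
    mean = float(np.sum(probs * scores))
    std = float(np.sqrt(np.sum(probs * (scores - mean) ** 2)))
    return mean, std

# A uniform distribution gives the midpoint score 5.5
mean, std = nima_mean_std([0.1] * 10)
```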
Hi titu1994, the AVA dataset consists almost entirely of aesthetic images. Can the NASNet model be used for distorted image quality assessment? Thank you very much!
Thanks, I got your point. I have another doubt: how do I train the model if my images are high resolution (e.g., 3968x2976) with locally distorted regions (e.g., local blur, local abnormal blocks)? In this case, resizing the image to 224x224 may suppress or even remove these abnormal regions, which are my concern.
Hi bobby-chiu, I have the same doubt. I tested one blur-distorted image (224x224 resolution) using NASNet, and it still got a high score. Maybe the NASNet model must be retrained with another dataset (e.g., TID2008, TID2013, or LIVE), but I have no idea how.
@stanfordone, retraining on TID2013 (small-resolution images with global distortions) has been implemented in a similar manner to NIMA. One more step that needs to be done is to estimate the score distribution with maximum entropy optimization, as mentioned in the paper. But retraining the model with high-resolution images is still unknown to me.
You could take 224x224 crops randomly out of the large image, but without knowing which part gets which score, it may give you wrong training results.
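The random-crop idea above can be sketched as follows (`random_crop` is a hypothetical helper, not part of this repo). Note the caveat from the comment: every crop inherits the whole image's label, which may be wrong for crops that miss the distorted region.

```python
import numpy as np

def random_crop(img, size=224, rng=None):
    """Take a random size x size crop from an H x W x C image array."""
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    if h < size or w < size:
        raise ValueError("image smaller than crop size")
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    return img[top:top + size, left:left + size]

# e.g. crop a 3968x2976 image down to the network's 224x224 input
crop = random_crop(np.zeros((2976, 3968, 3), dtype=np.uint8))
```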
I noticed that naturally blurred pictures got bad scores, but then typically other properties (lighting, composition, ...) were also bad. I'm not sure artificially blurring whole pictures is the correct approach, since that kind of image will not appear in most use cases. Of course you could train on TID2013, but then you get a different kind of assessment. What's your use case exactly?
I've also thought that by resizing the images you lose the ability to detect certain defects you can only see when looking really closely. But my intuition says this will not matter in many cases. Still, if you would like to improve here, you could for example train on multiple parts of the image, scaled to various resolutions, i.e., one 224x224 cropped region as titu1994 suggested, but perhaps not selected randomly and instead taken from an interesting region (which you could find with some visual attention model).
Or you can modify the network to accept images of arbitrary size; see the paper "Composition-preserving Deep Photo Aesthetics Assessment". They get some nice improvements by using an ensemble of multiple networks that work on the image at different resolutions. But of course you will get improvements from ensembles in any case, even if they all work at the same resolution. On the other hand, this will take very long to train (probably infeasible for me) and might also be too slow at the inference stage.
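One standard way to make a CNN accept arbitrary input sizes is to replace the flatten step before the dense head with global average pooling, which collapses any spatial resolution to a fixed-length vector. A toy numpy illustration of why that works (not code from this repo or from the paper above):

```python
import numpy as np

def global_average_pool(feature_map):
    """Collapse the spatial dims of an H x W x C feature map to a length-C vector."""
    return feature_map.mean(axis=(0, 1))

# The pooled vector has the same length for any input resolution,
# so the dense layers after it never see the image size.
v_small = global_average_pool(np.random.rand(7, 7, 64))
v_large = global_average_pool(np.random.rand(30, 40, 64))
```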
@bobby-chiu I want to retrain with the TID2013 dataset. But as you know, TID2013 is not the same as the AVA dataset. Is there anything I can refer to for the step (maximum entropy optimization) you mentioned above? Thank you in advance.
@threeon1318 Given the mean and standard deviation, the normal distribution is the distribution with maximum entropy.
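Concretely, that step can be sketched like this: build the normal distribution from the label's mean and std, then integrate it over each of the ten unit-wide score bins to get a NIMA-style distribution. The function name and the assumption that the TID2013 MOS values have been mapped onto the 1 to 10 NIMA scale are mine.

```python
import math

def max_entropy_distribution(mean, std, n_bins=10):
    """Discretize N(mean, std^2) over score bins 1..n_bins.

    Given only a mean and std (as in TID2013 labels), the normal
    distribution is the maximum-entropy choice; integrating it over
    each unit-wide bin yields a 10-bin label for training.
    """
    def cdf(x):
        return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

    probs = [cdf(i + 0.5) - cdf(i - 0.5) for i in range(1, n_bins + 1)]
    total = sum(probs)  # renormalize the mass clipped outside [0.5, 10.5]
    return [p / total for p in probs]

probs = max_entropy_distribution(5.5, 1.5)
```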
@threeon1318 Have you figured out the "maximum entropy optimization" step?
@threeon1318 I've found that another repo author has implemented the "maximum entropy optimization" step, click here. You can find the TID2013 label JSON file in data/TID2013/; that's all we need.
I tested two images (one good quality and one distorted) with the three trained models and got the scores below:

Evaluating_inception_resnet : D:\test\bad.PNG NIMA Score : 5.215 +- (1.719)
Evaluating_inception_resnet : D:\test\good.PNG NIMA Score : 5.694 +- (1.815)
Evaluating_mobilenet : D:\test\bad.PNG NIMA Score : 5.132 +- (1.723)
Evaluating_mobilenet : D:\test\good.PNG NIMA Score : 5.653 +- (1.651)
Evaluating_nasnet : D:\test\bad.PNG NIMA Score : 4.879 +- (1.759)
Evaluating_nasnet : D:\test\good.PNG NIMA Score : 6.736 +- (1.593)
It confuses me: why did the distorted image still get a high score?