jarikorhonen / hriq

High Resolution Image Quality (HRIQ) database and model

Categorical Bias? #1

Open Luke2642 opened 6 months ago

Luke2642 commented 6 months ago

Hi, great work collating this dataset, thank you.

I'm a bit confused, though. I checked the top 15 by MOS100 and they were mostly landscape shots of architecture on a blue-sky day, and two with horses.

Do you have any ideas of how to end up with a more balanced score across multiple categories? I'm thinking maybe start with 100 to 1000 excellent images well distributed across a range of categories and compositions, and then crop, rotate, corrupt, distort and blur them with combinations of techniques to target particular human-perceptual quality scores or categories, and then train the network to output those scores?
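As a rough illustration of the "combinations of techniques" idea, here is a minimal sketch that only plans which distortions to apply to each pristine image; it does not touch pixels. The distortion names, the three severity levels, and the function `make_recipes` are all hypothetical and not part of HRIQ. The resulting quality scores would still have to come from human ratings.

```python
import random
from itertools import combinations

# Hypothetical distortion names and severity levels (assumptions, not from HRIQ).
DISTORTIONS = ["crop", "rotate", "blur", "noise", "jpeg"]
SEVERITIES = [1, 2, 3]  # 1 = mild, 3 = strong


def make_recipes(image_ids, max_ops=2, per_image=4, seed=0):
    """Return a list of (image_id, [(distortion, severity), ...]) recipes."""
    rng = random.Random(seed)
    # Every combination of 1..max_ops distinct distortions.
    combos = [c for k in range(1, max_ops + 1)
              for c in combinations(DISTORTIONS, k)]
    recipes = []
    for image_id in image_ids:
        # Pick per_image different combinations for this source image.
        for combo in rng.sample(combos, per_image):
            ops = [(name, rng.choice(SEVERITIES)) for name in combo]
            recipes.append((image_id, ops))
    return recipes


if __name__ == "__main__":
    for image_id, ops in make_recipes(["a.jpg", "b.jpg"]):
        print(image_id, ops)
```

Fixing the seed keeps the plan reproducible, so the same distorted set can be regenerated for a later rating session.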

What do you think? Would you be interested in collaboration? :-) We could also focus on replicating specific photographic aberrations like this slide from AutoDIR!

image

jarikorhonen commented 6 months ago

Hi Luke, thank you.

There is at least some diversity, since the top rated photos represent both natural landscapes and architecture.

However, I acknowledge the problem in general. Users tend to give higher ratings to daytime photos than to low-light photos. This is somewhat expected, because low-light photos are more prone to camera shake and sensor noise, and sensor noise is even more visible in high resolution than in low resolution. In this respect, the dataset probably represents reality reasonably well, since low-light photos do tend to exhibit lower quality than daytime photos.

We have tried to select some sharp and colorful night time photos in the dataset, e.g., 77.jpg, 1027.jpg, 1037.jpg, 1038.jpg, 1050.jpg. Unfortunately, none of them made it to the top-10, but they still got reasonably high ratings.

There is, of course, no easy way to solve the problem without running new subjective experiments that include higher-quality night-time photos. It might be possible to reduce the bias to some extent by removing some of the high-quality daytime photos to make a subset of the dataset with, e.g., 1000 images. I am not sure it would make a big difference, though.
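The subsetting idea could be sketched roughly as follows. This is only an illustration under assumptions: the record layout (`file`, `mos`, `daytime`), the MOS threshold, and the function name `reduce_daytime_bias` are hypothetical, and a day/night label for each image would have to be created first.

```python
import random


def reduce_daytime_bias(records, target_size, mos_threshold, seed=0):
    """Drop randomly chosen high-MOS daytime images until target_size remain.

    records: list of dicts with keys 'file', 'mos', 'daytime' (bool).
    If there are too few candidates, the result stays larger than target_size.
    """
    rng = random.Random(seed)
    # Only daytime images at or above the threshold may be removed.
    candidates = [r for r in records
                  if r["daytime"] and r["mos"] >= mos_threshold]
    n_drop = min(max(len(records) - target_size, 0), len(candidates))
    dropped = {r["file"] for r in rng.sample(candidates, n_drop)}
    return [r for r in records if r["file"] not in dropped]
```

For example, calling it with `target_size=1000` on the full list would keep every night-time photo and thin out only the highly rated daytime ones.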

Kind regards, Jari

On Sat, Mar 23, 2024 at 1:36 PM Luke Perkin @.***> wrote:

Hi, great work collating this dataset, thank you.

I checked the top 10 by MOS and they seemed to be landscape shots of architecture on a blue-sky day, and one of horses in a field.

Do you have any ideas of how to end up with a more balanced score across multiple categories?

Luke2642 commented 6 months ago

Thanks for your reply. I'll have a good think about how to approach it. I realise it's not gonna be easy! Good-quality night photography is a tricky edge case.

image

jarikorhonen commented 6 months ago

Hi, I just noticed you had edited your original post. I think the research community is somewhat sceptical about artificially replicating natural distortions on pristine photos, since the result may look the same to humans but not to the AI. It is also worth noting that there are already quite a lot of image quality datasets with artificially generated distortions. It could be something worth exploring anyway; I will think about it.