What were the 20 questions?

JD-P / simulacra-aesthetic-captions

Dataset of prompts, synthetic AI generated images, and aesthetic ratings.


What were the 20 questions? #6

Open genolve opened 1 year ago

genolve commented 1 year ago

I'm new to SQLite, so I may be missing something. A dump of the tables gives: [('survey',), ('generations',), ('images',), ('sqlite_sequence',), ('paths',), ('ratings',), ('upscales',)]. Looking in ratings, I find the columns ['sid', 'iid', 'rating', 'verified']. I match the sid to the id in the survey table, which has the columns ['id', 'qid', 'rating']. I've sampled each table and do not see the questions that correspond to the qid in the survey table.
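
For anyone repeating this inspection, here is a minimal sketch of listing every table and its columns with Python's built-in sqlite3 module. The in-memory database is only a stand-in: the table and column names are the ones quoted above, but the column types are assumptions, and `describe_tables` is a hypothetical helper name. To run it against the real dataset, connect to the downloaded database file instead.

```python
import sqlite3

def describe_tables(conn):
    """Return {table_name: [column names]} for every table in the database."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
    schema = {}
    for table in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) rows.
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [col[1] for col in cols]
    return schema

# Stand-in database: names come from the thread, INTEGER types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey (id INTEGER, qid INTEGER, rating INTEGER)")
conn.execute(
    "CREATE TABLE ratings (sid INTEGER, iid INTEGER, rating INTEGER, verified INTEGER)")

schema = describe_tables(conn)
for table, columns in schema.items():
    print(table, columns)
```

This only reveals names and declared types; it cannot say what a column such as qid means, which is the open question here.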

ljy0ustc commented 1 year ago

Hi, I have the same question. Have you figured out what the columns in each table mean, especially for "ratings" (sid, iid, rating, verified)?

genolve commented 1 year ago

I believe the qid values in the survey table are not questions but these 20 images: https://github.com/JD-P/simulacra-aesthetic-captions#simulacra-aesthetic-survey They are used to detect user bias; e.g., if you were concerned with racial bias, you'd weed out users who give a low rating to pictures 1 and 5. As for the ratings table, this is how I implemented it in my PyTorch reader, near the bottom (SAC dataset) of this notebook: https://github.com/genolve/NIMA-pytorch-aesthetic-critic
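
A rough sketch of that weeding-out step, under several assumptions: that all of a user's survey answers share the same survey.id, that "low" means a rating at or below some cutoff, and that a user is flagged only when every listed control image got a low rating. The helper name `flagged_survey_ids`, the cutoff of 3, and the sample rows are all hypothetical, not taken from the dataset.

```python
import sqlite3

def flagged_survey_ids(conn, control_qids, max_rating):
    """Survey ids that rated every image in control_qids at or below max_rating."""
    placeholders = ",".join("?" for _ in control_qids)
    query = f"""
        SELECT id FROM survey
        WHERE qid IN ({placeholders}) AND rating <= ?
        GROUP BY id
        HAVING COUNT(DISTINCT qid) = ?
        ORDER BY id
    """
    rows = conn.execute(query, (*control_qids, max_rating, len(control_qids)))
    return [row[0] for row in rows]

# Stand-in data (made up): survey 1 rates both control images low,
# survey 2 rates only one of them low, survey 3 rates neither low.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey (id INTEGER, qid INTEGER, rating INTEGER)")
conn.executemany("INSERT INTO survey VALUES (?, ?, ?)", [
    (1, 1, 2), (1, 5, 1), (1, 7, 9),
    (2, 1, 8), (2, 5, 2),
    (3, 1, 9), (3, 5, 9),
])

flagged = flagged_survey_ids(conn, control_qids=[1, 5], max_rating=3)
print(flagged)
```

The flagged ids could then be used to exclude rows from ratings by matching them against ratings.sid, as described earlier in the thread.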

ljy0ustc commented 1 year ago

Thanks a lot for answering! As I understand it, "ratings.sid" identifies the user who scored the image. I'm also wondering how to explain why the same ("ratings.sid", "ratings.iid") pair can have several different "ratings.rating" values.
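
To check how often this happens, one could group the ratings table by (sid, iid) and keep only the pairs with more than one distinct rating. The sketch below does this on made-up rows in an in-memory database; the column types and the helper name `conflicting_pairs` are assumptions.

```python
import sqlite3

def conflicting_pairs(conn):
    """Return (sid, iid, row count, min rating, max rating) for each
    (sid, iid) pair that appears with more than one distinct rating."""
    return conn.execute("""
        SELECT sid, iid, COUNT(*), MIN(rating), MAX(rating)
        FROM ratings
        GROUP BY sid, iid
        HAVING COUNT(DISTINCT rating) > 1
        ORDER BY sid, iid
    """).fetchall()

# Stand-in data (made up): pair (1, 10) has two different ratings,
# pair (2, 10) is repeated with the same rating, pair (1, 11) occurs once.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ratings (sid INTEGER, iid INTEGER, rating INTEGER, verified INTEGER)")
conn.executemany("INSERT INTO ratings VALUES (?, ?, ?, ?)", [
    (1, 10, 7, 1), (1, 10, 4, 0), (1, 11, 5, 1), (2, 10, 6, 1), (2, 10, 6, 1),
])

pairs = conflicting_pairs(conn)
print(pairs)
```

The query only surfaces the conflicting pairs; why they exist in the dataset is still the open question.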