fire-eggs / Danbooru2021

Python scripts and tools for working with the Danbooru2021 data set. Note: this is a SQLite database and a viewer, not directly related to machine learning.
https://www.gwern.net/Danbooru2021
MIT License

How should image (parent / child) relationships be handled? #8

Open fire-eggs opened 4 years ago

fire-eggs commented 4 years ago
  1. Duplicate image: the image has a parent_id value, but the "parent" does not have the "has_children" flag set.
  2. Child: the image has a parent_id value, and the "parent" does have the "has_children" flag set.
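
The two cases above can be written as a small decision function. This is only a sketch: `parent_id` and `has_children` come from the cases above, while the function name `classify` and the dict-shaped parent record are hypothetical and not taken from this repo. It assumes the parent's record is available; a parent that is missing from the data set is not covered by either case.

```python
from typing import Optional


def classify(parent_id: Optional[int], parent: Optional[dict]) -> str:
    """Classify one image as 'none', 'duplicate', or 'child'.

    `parent` is the parent's record as a dict (hypothetical shape),
    e.g. {"has_children": 1}. Pass None when the image has no parent.
    """
    if parent_id is None:
        return "none"  # no parent relationship at all
    if parent is None:
        # Assumption: the source does not say what to do here.
        return "unknown"
    # Case 2: parent has the has_children flag set -> child.
    # Case 1: parent lacks the flag -> duplicate.
    return "child" if parent.get("has_children") else "duplicate"
```

For example, `classify(5, {"has_children": 0})` falls under case 1 and `classify(5, {"has_children": 1})` under case 2.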
fire-eggs commented 4 years ago

For case 1, use a script to mark the affected images as 'duplicate' in the database.
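
A rough sketch of what such a script might look like, using Python's built-in `sqlite3` module. The table name `images` and the columns `image_id` and `is_duplicate` are assumptions made for illustration; only `parent_id` and `has_children` are named in this issue, so the real schema may well differ.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory database with an assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE images ("
        " image_id INTEGER PRIMARY KEY,"
        " parent_id INTEGER,"
        " has_children INTEGER DEFAULT 0,"
        " is_duplicate INTEGER DEFAULT 0)"  # hypothetical column
    )
    conn.executemany(
        "INSERT INTO images (image_id, parent_id, has_children) VALUES (?, ?, ?)",
        [
            (1, None, 0),  # parent without the flag
            (2, 1, 0),     # -> duplicate (case 1)
            (3, None, 1),  # parent with the flag
            (4, 3, 0),     # -> child (case 2), left unmarked
            (5, None, 0),  # no relationship
        ],
    )
    conn.commit()
    return conn


def mark_duplicates(conn: sqlite3.Connection) -> int:
    """Flag images whose parent lacks has_children; return rows changed.

    Images whose parent row is absent from the table are left untouched.
    """
    cur = conn.execute(
        """
        UPDATE images
           SET is_duplicate = 1
         WHERE parent_id IS NOT NULL
           AND EXISTS (
               SELECT 1 FROM images AS p
                WHERE p.image_id = images.parent_id
                  AND COALESCE(p.has_children, 0) = 0
           )
        """
    )
    conn.commit()
    return cur.rowcount
```

Running `mark_duplicates(make_demo_db())` on the demo rows flags only image 2. Doing the work in a single `UPDATE` keeps the script idempotent, so it can be rerun after new rows are imported.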