fire-eggs / Danbooru2021

Python scripts and tools for working with the Danbooru2021 data set. Note: this is a SQLite database and a viewer, not directly related to machine learning.
https://www.gwern.net/Danbooru2021
MIT License
42 stars, 2 forks

Check for missing images #15

Closed fire-eggs closed 4 years ago

fire-eggs commented 4 years ago

The image 3367946 is missing from the file set. Are there any other files missing?

fire-eggs commented 4 years ago

[(3361629,), (3361635,), (3361710,), (3361762,), (3361833,), (3361837,), (3361845,), (3361846,), (3361847,), (3361941,), (3361951,), (3362088,), (3362140,), (3362238,), (3362274,), (3362278,), (3362292,), (3362315,), (3362349,), (3362369,), (3362387,), (3362433,), (3362453,), (3362466,), (3362485,), (3362489,), (3362514,), (3362515,), (3362589,), (3362684,), (3362686,), (3363089,), (3363621,), (3363652,), (3364010,), (3364083,), (3364180,), (3364181,), (3364282,), (3364303,), (3364628,), (3364741,), (3364743,), (3364746,), (3364751,), (3364797,), (3364930,), (3364936,), (3364972,), (3364985,), (3364987,), (3364989,), (3365016,), (3365026,), (3365067,), (3365192,), (3365200,), (3365201,), (3365203,), (3365241,), (3365338,), (3365339,), (3365341,), (3365343,), (3365365,), (3365369,), (3365523,), (3365552,), (3365762,), (3365763,), (3365764,), (3365765,), (3365766,), (3365767,), (3365769,), (3365772,), (3365773,), (3365774,), (3365776,), (3365777,), (3365778,), (3365780,), (3365781,), (3365782,), (3365804,), (3365806,), (3365816,), (3365817,), (3365818,), (3365823,), (3365873,), (3365920,), (3365921,), (3366168,), (3366187,), (3366260,), (3366292,), (3366314,), (3366343,), (3366505,), (3366783,), (3366793,), (3366804,), (3366821,), (3366886,), (3367037,), (3367144,), (3367165,), (3367166,), (3367186,), (3367201,), (3367212,), (3367278,), (3367279,), (3367281,), (3367384,), (3367389,), (3367447,), (3367700,), (3367756,), (3367764,), (3367800,), (3367816,), (3367821,), (3367825,), (3367829,), (3367833,), (3367920,), (3367969,), (3367973,), (3367982,), (3367997,), (3368016,), (3368024,), (3368030,), (3368118,), (3368193,), (3368199,), (3368200,), (3368343,), (3368448,), (3368635,)]

fire-eggs commented 4 years ago

Some of the above require a "gold account" to access.

fire-eggs commented 4 years ago

Write a script to scan for missing files.

fire-eggs commented 4 years ago

2569 files identified as "missing": they are included in the metadata but have no image in the file set. These are marked as "user_delete = 2" in the database.
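
For anyone wanting to repeat the check, here is a minimal sketch of such a scan. It is not the script used in this repo. The table name `images` and the columns `image_id` and `file_ext` are hypothetical placeholders. The directory layout (bucket folder = ID modulo 1000, zero-padded to four digits) and the `user_delete = 0` filter for "not yet flagged" rows are also assumptions. Only the `user_delete = 2` marker comes from the comment above.

```python
import sqlite3
from pathlib import Path


def expected_path(root: Path, image_id: int, ext: str) -> Path:
    # Assumed layout: <root>/<id mod 1000, zero-padded to 4 digits>/<id>.<ext>
    return root / f"{image_id % 1000:04d}" / f"{image_id}.{ext}"


def mark_missing(conn: sqlite3.Connection, root: Path, table: str = "images") -> list[int]:
    """Flag rows whose image file is absent on disk; return their IDs.

    `table`, `image_id` and `file_ext` are hypothetical names; adjust
    them to the actual schema before use.
    """
    rows = conn.execute(
        f"SELECT image_id, file_ext FROM {table} "
        "WHERE user_delete = 0 ORDER BY image_id"  # assumption: 0 means not flagged
    ).fetchall()

    # A row counts as missing when metadata exists but the file does not.
    missing = [
        image_id
        for image_id, ext in rows
        if not expected_path(root, image_id, ext).is_file()
    ]

    # Mark the missing rows with user_delete = 2, as described above.
    conn.executemany(
        f"UPDATE {table} SET user_delete = 2 WHERE image_id = ?",
        [(image_id,) for image_id in missing],
    )
    conn.commit()
    return missing
```

Calling `mark_missing(sqlite3.connect("danbooru.db"), Path("original"))` would then return the list of flagged IDs; both the database file name and the image root here are placeholders.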