fire-eggs / Danbooru2021

Python scripts and tools for working with the Danbooru2021 data set. Note: this is a sqlite database and a viewer, not directly related to machine learning.
https://www.gwern.net/Danbooru2021
MIT License
42 stars 2 forks

FilterView performance is slow #58

Open fire-eggs opened 2 months ago

fire-eggs commented 2 months ago

The query issued by FilterView is slow, mainly because its like clauses use a leading wildcard, which forces a scan of the whole tags table (no useful index is possible).

E.g. the following query:

select image_id from images where is_deleted=0 and is_banned=0 and image_id in 
(select image_id from imageTags where tag_id in (select tag_id from tags where name like '%girl%')) order by image_id

takes over 22 seconds to execute (returning 6,260,963 rows).
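
To see the scan in a small, reproducible form, here is a rough sketch using Python's sqlite3 module against an in-memory database. The schema is an assumption inferred from the query above (the real tables surely have more columns), and the index names idx_tags_name and idx_imageTags_tag are hypothetical. It runs the same filter query on a few rows and compares the query plan for a leading-wildcard like against an exact match on the same indexed column.

```python
import sqlite3

# Assumed minimal schema, inferred from the query in this issue.
# Index names are hypothetical and only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table images (image_id integer primary key,
                         is_deleted integer, is_banned integer);
    create table tags (tag_id integer primary key, name text);
    create table imageTags (image_id integer, tag_id integer);
    create index idx_tags_name on tags(name);
    create index idx_imageTags_tag on imageTags(tag_id);
""")
conn.executemany("insert into images values (?, ?, ?)",
                 [(1, 0, 0), (2, 0, 0), (3, 1, 0), (4, 0, 1), (5, 0, 0)])
conn.executemany("insert into tags values (?, ?)",
                 [(1, "1girl"), (2, "multiple_girls"), (3, "1boy")])
conn.executemany("insert into imageTags values (?, ?)",
                 [(1, 1), (2, 3), (3, 1), (4, 2), (5, 2), (5, 1)])

# Same shape as the FilterView query, with the pattern as a parameter.
FILTER_QUERY = """
    select image_id from images where is_deleted=0 and is_banned=0
    and image_id in
    (select image_id from imageTags where tag_id in
        (select tag_id from tags where name like ?))
    order by image_id
"""

def plan(sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("explain query plan " + sql, params).fetchall()
    return [row[-1] for row in rows]

matching_ids = [r[0] for r in conn.execute(FILTER_QUERY, ("%girl%",))]
like_plan = plan("select tag_id from tags where name like ?", ("%girl%",))
exact_plan = plan("select tag_id from tags where name = ?", ("1girl",))

print(matching_ids)  # [1, 5]
print(like_plan)     # a SCAN step: every tag name is visited
print(exact_plan)    # a SEARCH step: the index on name is used
```

The exact plan text varies between SQLite versions, but the like lookup should be reported as a SCAN and the exact-match lookup as a SEARCH, even though both have the same index on tags.name available.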