TagStudioDev / TagStudio

A User-Focused Photo & File Management System
https://docs.tagstud.io/
GNU General Public License v3.0

[Bug]: _match_missing_file walks library for every missing file #610

Open Toby222 opened 4 days ago

Toby222 commented 4 days ago

TagStudio Version

Alpha 9.4.0+

Operating System & Version

NixOS unstable

Description

The Search & Relink feature for missing files is incredibly slow for larger libraries, since it walks the library directory again for every missing entry.

(I noticed this when reorganizing my photos folder from YYYY/ folders to YYYY/MM/DD/ folders, so there were ~11000 moved files)

Expected Behavior

The list of files in the library should be cached between calls to _match_missing_file to speed up relinking of unlinked entries. As a bodge solution, I added a simple member _cache to the Library class and, in _match_missing_file, initialize it with the result of os.walk if it is not already set.
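For illustration, here is a minimal sketch of that caching idea, not the actual TagStudio code. The `FileNameIndex` class, its `match_missing_file` and `invalidate` methods, and the assumption that matching is done by file name are all hypothetical. It walks the directory once with `os.walk`, stores the result in a dict keyed by file name, and answers every later lookup from that dict.

```python
import os
from collections import defaultdict
from pathlib import Path


class FileNameIndex:
    """Hypothetical helper: maps a file name to every relative path that has it."""

    def __init__(self, library_dir):
        self.library_dir = Path(library_dir)
        self._by_name = None  # built lazily on the first lookup

    def _build(self):
        index = defaultdict(list)
        # One os.walk pass over the whole library directory.
        for root, _dirs, files in os.walk(self.library_dir):
            for name in files:
                rel = Path(root, name).relative_to(self.library_dir)
                index[name].append(rel)
        self._by_name = dict(index)

    def match_missing_file(self, missing_path):
        """Return candidate relative paths whose file name matches missing_path."""
        if self._by_name is None:
            self._build()
        return self._by_name.get(Path(missing_path).name, [])

    def invalidate(self):
        """Drop the cached index, e.g. after files were moved on disk."""
        self._by_name = None
```

With something like this, relinking ~11000 entries would cost one directory walk plus one dict lookup per entry instead of one walk per entry. The trade-off is that the cache goes stale if files change during the run, hence the explicit `invalidate` call.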

Steps to Reproduce

  1. Create a library with a lot of entries
  2. Move multiple files to a new directory
  3. Search & Relink the now-missing entries
  4. Observe that between every entry a second or more can pass

Logs

No response

CyanVoxel commented 4 days ago

I would call this more of a request/need for optimization than a bug since the code is working as intended, but I do agree that this area of the code could really use some optimizing.

Toby222 commented 4 days ago

Between feature request and bug, I felt like "doesn't work as it should" fit better than "doesn't work like how I want". There's no clear category for where to put performance issues over stuff that actually breaks something :^)