deepfates / memery

Search over large image datasets with natural language and computer vision!
https://deepfates.com/memery
MIT License
526 stars, 27 forks

Corrupt files no longer cause reindexing #25

Closed (wkrettek closed 2 years ago)

wkrettek commented 2 years ago

I made a wrapper for the get_image_files function so it only returns valid images when checking whether to reindex. I also changed the list of valid image suffixes to a set because (I think) that should be more efficient when using the 'in' keyword.
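For readers following along, here is a minimal sketch of the wrapper idea described above. It is not the code from this PR: `get_valid_images`, the suffix set, and the magic-byte check inside `verify_image` are all assumptions made for illustration (a real check would more likely open the file with an image library). It only shows the shape of "list candidates by suffix, then filter out files that fail validation".

```python
from pathlib import Path

# Assumed suffix list. A set gives average O(1) membership tests with `in`,
# whereas a list is scanned element by element.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif"}

# Leading bytes ("magic numbers") for the formats above. Stand-in check only.
MAGIC_PREFIXES = (b"\x89PNG\r\n\x1a\n", b"\xff\xd8\xff", b"GIF87a", b"GIF89a")


def get_image_files(root):
    """Return every file under `root` whose suffix looks like an image."""
    return [
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    ]


def verify_image(path):
    """Hypothetical cheap check: does the file start with a known signature?"""
    try:
        with open(path, "rb") as f:
            header = f.read(8)
    except OSError:
        return False
    # bytes.startswith accepts a tuple of candidate prefixes.
    return header.startswith(MAGIC_PREFIXES)


def get_valid_images(root):
    """Hypothetical wrapper: only the files that pass verify_image."""
    return [p for p in get_image_files(root) if verify_image(p)]
```

Keeping the wrapper separate means callers that want every candidate file, corrupt or not, can still call `get_image_files` directly.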

In the future I think it may be better to have a refresh button on the UI that lets the user manually decide whether to check for new files, but for now it still checks automatically with every query. I didn't make any new tests or documentation for this, so let me know if anything should be changed!

wkrettek commented 2 years ago

It may be easier to just combine the two functions so get_image_files always returns valid images, but I haven't looked through all the code to see where that func is called, so idk if there was ever a point where you wanted to return both valid and corrupt images.

deepfates commented 2 years ago

Okay, tested this out and it seems solid. Tried putting the verify_image logic in the get_image_files list comprehension and it 10xed search time, so let's keep it your way for now.
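One plausible reason for the slowdown is that validating inside the listing step opens every file on every query. The sketch below is an assumption, not what this PR or memery actually does: `files_needing_index` and its parameters are hypothetical names. It illustrates one way to keep the check cheap by validating only files that are not already in the index.

```python
def files_needing_index(candidates, indexed, is_valid):
    """Return candidates that are both new and valid.

    `candidates` is the current file listing, `indexed` is whatever was
    indexed previously, and `is_valid` is any validation callable.
    All three are hypothetical inputs for this sketch.
    """
    indexed = set(indexed)  # set membership avoids rescanning a list per file
    new_files = [p for p in candidates if p not in indexed]
    # Only the (usually few) new files pay the cost of validation.
    return [p for p in new_files if is_valid(p)]
```

With this shape, a corrupt file is re-checked on each query but never appears in the result, so it would not trigger a reindex on its own.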

There are still a lot of other changes that need to be made to this repo, but this one doesn't hurt anything. Merging.