0ssamaak0 / CLIPPyX

AI-powered image search tool offering system-wide content-based, text, and visual similarity search.
MIT License
110 stars 10 forks

Find Unix (Linux / macOS) Alternative for Everything #5

Open 0ssamaak0 opened 4 weeks ago

0ssamaak0 commented 4 weeks ago

Everything

We need similar functionality (with any kind of API to call) to get the paths of all images.

MahmoudAshraf97 commented 4 weeks ago

I guess using os.walk and glob would be sufficient and portable, but you would still need to prepare patterns that match all image extensions and find a way to monitor changes.
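A minimal sketch of the portable approach suggested here, for illustration only. The extension set `IMAGE_EXTENSIONS` and the helper name `iter_image_paths` are assumptions and do not come from CLIPPyX. It uses `os.walk` with a case-insensitive suffix check instead of glob patterns, and it does not address change monitoring:

```python
import os

# Assumed set of image extensions; adjust to whatever the indexer should cover.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"}

def iter_image_paths(root):
    """Yield paths of image files under `root`, matched by extension (case-insensitive)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                yield os.path.join(dirpath, name)
```

Because it is a generator, paths can be consumed lazily, e.g. `for path in iter_image_paths(os.path.expanduser("~")): ...`.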

0ssamaak0 commented 4 weeks ago

> I guess using os.walk and glob would be sufficient and portable, but you would still need to prepare patterns that match all image extensions and find a way to monitor changes.

It should work, I guess, and the pattern is not the problem. The main problem is that it would take too much time. Everything already handles indexing, monitoring changes, and updating everything.
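For comparison, a rough stdlib-only sketch of how change detection could work without an indexing service: take a snapshot of path-to-modification-time pairs, then diff it against a later snapshot. This is polling, not the event-driven indexing that Everything provides, and the function names `snapshot` and `diff_snapshots` as well as the default extensions are hypothetical:

```python
import os

def snapshot(root, extensions=(".jpg", ".jpeg", ".png")):
    """Map each matching file path under `root` to its last-modified time (ns)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(extensions):
                path = os.path.join(dirpath, name)
                try:
                    state[path] = os.stat(path).st_mtime_ns
                except OSError:
                    # File vanished or is unreadable between listing and stat; skip it.
                    continue
    return state

def diff_snapshots(old, new):
    """Return (added, removed, modified) path lists between two snapshots."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    modified = sorted(p for p in new if p in old and new[p] != old[p])
    return added, removed, modified
```

Each snapshot still walks the whole tree, so this only reduces re-processing of unchanged images, not the scan time itself.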

MahmoudAshraf97 commented 4 weeks ago

Check this: https://github.com/cboxdoerfer/fsearch, although I'm in favor of building a simpler, OS-independent pipeline.

0ssamaak0 commented 4 weeks ago

> Check this: https://github.com/cboxdoerfer/fsearch, although I'm in favor of building a simpler, OS-independent pipeline.

Ofc, me too. The only OS-dependent part is the Everything SDK.

I tried os.scandir and it got all images in 3:30 minutes, which is not bad if it's done once (not each time the server starts). We still need to think about how to monitor changes. Any ideas?
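A sketch of what "scan once, not on every server start" could look like, assuming the path list is cached in a JSON file. The helper names `scan_images` and `load_or_build_index`, the cache format, and the default extensions are all hypothetical and not taken from the CLIPPyX code:

```python
import json
import os

def scan_images(root, extensions=(".jpg", ".jpeg", ".png")):
    """Recursively collect image paths under `root` using os.scandir."""
    found = []
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    elif entry.is_file() and entry.name.lower().endswith(extensions):
                        found.append(entry.path)
        except OSError:
            # Skip directories that cannot be read (permissions, races).
            continue
    return found

def load_or_build_index(root, cache_file):
    """Reuse a cached path list if present; otherwise scan once and save it."""
    if os.path.exists(cache_file):
        with open(cache_file, encoding="utf-8") as f:
            return json.load(f)
    paths = scan_images(root)
    with open(cache_file, "w", encoding="utf-8") as f:
        json.dump(paths, f)
    return paths
```

The cached list goes stale as soon as files are added or removed, so it only covers the initial scan; keeping it current still needs some form of change monitoring.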

I'm thinking of adding this as the basic setup (like HF transformers for models), and if anyone wants better options, there's Everything or equivalent tools.

I'm really grateful for your help and your ideas ❤️❤️

MahmoudAshraf97 commented 4 weeks ago

Check this: https://www.geeksforgeeks.org/create-a-watchdog-in-python-to-look-for-filesystem-changes/. Also, I found that using the native search functionality of each system is very fast, so if you are willing to compromise on OS independence, maybe use it:

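import subprocess

def find_files(directory, extensions):
    """Return paths under `directory` whose extension matches (case-insensitive)."""
    # Builds: find <directory> -type f ( -iname *.jpg -o -iname *.png )
    # Passing a list without shell=True avoids quoting problems with spaces in paths.
    cmd = ["find", directory, "-type", "f", "("]
    for i, ext in enumerate(extensions):
        if i:
            cmd.append("-o")
        cmd += ["-iname", f"*.{ext}"]
    cmd.append(")")
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout.splitlines()

files = find_files('mnt', ['jpg', 'png'])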

great work btw

0ssamaak0 commented 4 weeks ago

Amazing! I didn't know this already existed lol. I will give it a try 😉😉