Open t0saki opened 3 weeks ago
Excellent work, and so many ideas 😁😁 I'm really happy you're interested.
The CLIPPyX command is the entry point for main.py:
# main.py
import subprocess
import yaml

# Load the configuration file
with open('config.yaml', 'r') as f:
    config = yaml.safe_load(f)

subprocess.run(["python", "Index/everything_images.py"])

if config["server_os"] == "wsl":
    print("Running in WSL")
    subprocess.check_call(["wsl", "-e", "bash", "server_wsl.sh"])
elif config["server_os"] == "windows":
    print("Running in Windows")
    subprocess.run(["python", "server.py"])
Instead of removing Everything indexing, we can add your updates as an option for Unix-based systems, and you set up your server from config.yaml (I may create a GUI for it too).
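A minimal sketch of how main.py could pick the indexing step from config.yaml, so that Everything stays the default on Windows and a directory scan is used elsewhere. The `indexer` key and the `Index/scan_images.py` script are hypothetical names used only for illustration; they are not part of the repository as discussed in this thread.

```python
import sys

# Hypothetical mapping from a config value to the indexing script to run.
# "Index/everything_images.py" is the script already called from main.py;
# "Index/scan_images.py" is an assumed name for a Unix directory scanner.
INDEXERS = {
    "everything": "Index/everything_images.py",
    "scan": "Index/scan_images.py",
}


def indexer_command(config, platform=sys.platform):
    """Return the command list that runs the configured indexer."""
    # Assumed key "indexer": default to Everything on Windows, scan elsewhere.
    default = "everything" if platform.startswith("win") else "scan"
    choice = config.get("indexer", default)
    if choice not in INDEXERS:
        raise ValueError(f"unknown indexer: {choice!r}")
    return [sys.executable, INDEXERS[choice]]
```

The returned list could be passed to `subprocess.run` in place of the hard-coded Everything call, which would leave the Windows path unchanged.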
I Occam's-razored it and thought a simple text file rewritten each time is a good option, mainly because Everything indexing takes almost no time, but if there's a reason to do otherwise, please tell me.
From what I've seen, you're creating an index for a single directory only (not all images on your OS). That is good functionality if you're interested in a single directory (of course, I'm planning to add this option later), but what I mean is that you didn't find an alternative to Everything for indexing all images on your disk.
Can you make a separate PR for it? I will merge it immediately.
I'm not an expert either, but I'm planning to add this as soon as I add decent support for Unix machines.
For the monitoring, I leave everything to Everything (lol): it indexes my files in the background, and once I run everything_images.py I get the updated list. We need to look for a similar alternative on Unix, and I'm sure we will find an equivalent or at least a close option.
Thank you for your amazing ideas
I have submitted a PR to provide a WebUI in Flask https://github.com/0ssamaak0/CLIPPyX/pull/9.
Using SQLite to store the file list is aimed at maintaining high performance when regenerating the list, especially if the file index is very large. Writing out a huge txt file each time could be very time-consuming (although this may not be observed currently). However, this might need to be implemented together with file change monitoring, so as you mentioned, the current implementation might not be necessary.
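As a rough illustration of the performance argument, the sketch below updates an SQLite table in place and writes only the paths that are new, changed, or gone, instead of rewriting the full list. The `files` table, its columns, and the `sync_file_list` function are assumptions made for this example, not the schema used in the branch.

```python
import sqlite3


def sync_file_list(conn, current):
    """Bring the `files` table in line with `current` (path -> mtime).

    Only new, changed, or deleted paths are written.
    Returns (added_or_changed, removed) counts.
    """
    # Hypothetical schema: one row per file, keyed by its path.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, mtime REAL)"
    )
    stored = dict(conn.execute("SELECT path, mtime FROM files"))
    upserts = [(p, m) for p, m in current.items() if stored.get(p) != m]
    removed = [(p,) for p in stored if p not in current]
    with conn:  # a single transaction covers all writes
        conn.executemany(
            "INSERT OR REPLACE INTO files (path, mtime) VALUES (?, ?)", upserts
        )
        conn.executemany("DELETE FROM files WHERE path = ?", removed)
    return len(upserts), len(removed)
```

When nothing has changed, a second call performs no writes at all, which is where this approach would be expected to pay off on a very large index.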
Using a single directory is intended to eventually distribute this project as a Docker image to run on servers and other devices — in such a case, a directory from the host machine could be bound to the container, and the service would scan this directory. However, this goes against the current purpose of using Everything to index Windows files, so this is limited to my own needs.
I really like the project you provided! I hope this project can become even more perfect.
Thank you very much for your work! I have always wanted to use a more powerful model to replace the built-in AI search in Synology Photos to help me index the pictures on my NAS. This project is very close to my needs.
To get this project running on my NAS, I modified the indexing part of the code to use a less efficient but more general scanning method and stored the file list in an SQLite DB.
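For readers who want a picture of what such a general scan might look like, here is a minimal sketch using only the standard library. It is not the code from the branch: the extension set, the `images` table, and the function names are assumptions for illustration.

```python
import os
import sqlite3

# Assumed set of extensions; adjust to whatever the image loader supports.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"}


def scan_images(root):
    """Yield absolute paths of image files under `root`, recursively."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                yield os.path.abspath(os.path.join(dirpath, name))


def store_paths(db_path, paths):
    """Insert paths into a hypothetical `images` table; return the row count."""
    conn = sqlite3.connect(db_path)
    with conn:  # commit once after all inserts
        conn.execute("CREATE TABLE IF NOT EXISTS images (path TEXT PRIMARY KEY)")
        conn.executemany(
            "INSERT OR IGNORE INTO images (path) VALUES (?)",
            ((p,) for p in paths),
        )
    count = conn.execute("SELECT COUNT(*) FROM images").fetchone()[0]
    conn.close()
    return count
```

Walking the tree touches every directory on each run, which is why it is slower than querying an index that Everything maintains in the background, but it works on any platform.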
Additionally, I added an interface for the frontend index.html in server.py. Now you can access the search web page via /index after starting server.py.
Since I won't be running this project on Windows, I completely removed support for Everything. If you think my modifications can be merged into the mainline, I can restore the indexing logic for the Windows platform and submit a PR so as not to degrade performance on Windows.
If I have the energy later, I might create a Dockerfile (although I am not very proficient at it) and support monitoring file system changes to scan new files in real-time and improve indexing performance. If you already have related plans, please let me know.
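One simple way to approximate change monitoring without extra dependencies is to compare periodic snapshots of the directory. The sketch below shows that polling approach only as an illustration; the function names are made up for this example, and it is not real-time monitoring.

```python
import os


def snapshot(root):
    """Map each file path under `root` to its modification time."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                state[path] = os.stat(path).st_mtime
            except OSError:
                continue  # file vanished between listing and stat
    return state


def diff_snapshots(old, new):
    """Return (added, modified, deleted) path lists between two snapshots."""
    added = sorted(p for p in new if p not in old)
    modified = sorted(p for p in new if p in old and new[p] != old[p])
    deleted = sorted(p for p in old if p not in new)
    return added, modified, deleted
```

Calling `snapshot` on a timer and passing consecutive results to `diff_snapshots` gives the list of files to re-index. An event-based watcher (for example inotify on Linux, or a third-party package such as watchdog) would avoid the repeated full walk, at the cost of an extra dependency.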
I don't have much development experience in this area, so if any of my implementations seem ridiculous, please point them out.
Thank you :)
My development branch: https://github.com/0ssamaak0/CLIPPyX/compare/main...t0saki:CLIPPyX:support-linux?expand=1