neonwatty / meme_search

Index your memes by their content and text, making them easily retrievable for your meme warfare pleasures. Find funny fast.
https://memesearch.co/
Apache License 2.0
288 stars, 9 forks

get_current_indexed_img_names failed with exception unable to open database file #28

Open ProfesseurIssou opened 1 month ago

ProfesseurIssou commented 1 month ago

Hi, I'm trying to use your docker compose file, adjusted as follows:

version: '3.8'

volumes:
  meme_search-config:
    driver_opts:
      type: "nfs"
      o: "addr={{IP}},rw,noatime,rsize=8192,wsize=8192,tcp,timeo=14,nfsvers=4"
      device: ":/volume1/docker-config/meme_search-config"
  syno-image:
    driver_opts:
      type: "nfs"
      o: "addr={{IP}},rw,noatime,rsize=8192,wsize=8192,tcp,timeo=14,nfsvers=4"
      device: ":/volume2/multimedia/image"

services:
  meme_search:
    image: ghcr.io/neonwatty/meme-search:latest
    container_name: meme_search
    ports:
      - 8501:8501
    volumes:
      - meme_search-config:/home/data
      - syno-image:/home/data/input
    ## uncomment to enable GPU support for the container
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: 1
    #           capabilities: [gpu]

From the web app I clicked 'refresh index', but I got this error:

ValueError: FAILURE: get_current_indexed_img_names failed with exception unable to open database file
Traceback:

File "/usr/local/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 589, in _run_script
    exec(code, module.__dict__)
File "/home/meme_search/app.py", line 33, in <module>
    val = process()
File "/home/meme_search/utilities/create.py", line 8, in process
    old_imgs_to_be_removed, new_imgs_to_be_indexed = get_input_directory_status(img_dir, sqlite_db_path)
File "/home/meme_search/utilities/status.py", line 24, in get_input_directory_status
    current_indexed_names = get_current_indexed_img_names(sqlite_db_path)
File "/home/meme_search/utilities/status.py", line 18, in get_current_indexed_img_names
    raise ValueError(f"FAILURE: get_current_indexed_img_names failed with exception {e}")

I also tried running this command from the container shell:

python meme_search/utilities/create.py

But I got the same error.
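For anyone hitting the same message: sqlite3 reports "unable to open database file" when it cannot open or create the file at the given path, which commonly means the parent directory is missing or not writable on the mounted volume. Below is a minimal diagnostic sketch that could be run from the container shell. The helper name `diagnose_sqlite_path` is hypothetical and not part of the repo, and the database path to pass in depends on how the app is configured.

```python
import os
import sqlite3


def diagnose_sqlite_path(db_path: str) -> str:
    """Return a short explanation of why sqlite3 may fail to open db_path.

    Hypothetical helper for debugging; not part of meme_search.
    """
    parent = os.path.dirname(os.path.abspath(db_path))
    if not os.path.isdir(parent):
        return f"parent directory does not exist: {parent}"
    if not os.path.exists(db_path) and not os.access(parent, os.W_OK):
        return f"database is missing and {parent} is not writable"
    if os.path.exists(db_path) and not os.access(db_path, os.R_OK):
        return f"database exists but is not readable: {db_path}"
    try:
        # mode=ro opens read-only, so this check never creates an empty file.
        # Paths containing '?' or '#' would need URI escaping first.
        sqlite3.connect(f"file:{db_path}?mode=ro", uri=True).close()
    except sqlite3.OperationalError as e:
        return f"sqlite3 could not open the file: {e}"
    return "ok"
```

Calling it with the path the app uses for its SQLite file should narrow down whether the problem is the mount, the permissions, or the file itself.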

neonwatty commented 1 month ago

Hi! This looks like an adjusted version of the compose file from the repo that I'm not familiar with at the moment.

I can't tell: is your meme_search service mounting the local ./data directory to /home/data in the container?

Instead of the volume mount you have

    volumes:
      - meme_search-config:/home/data
      - syno-image:/home/data/input

can you try our current mount point

    volumes:
      - ./data:/home/data

ProfesseurIssou commented 1 month ago

I just launched the stack with the default docker-compose and I get this error:

FileNotFoundError: [Errno 2] No such file or directory: '/home/data/input/'
Traceback:

File "/usr/local/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 589, in _run_script
    exec(code, module.__dict__)
File "/home/meme_search/app.py", line 33, in <module>
    val = process()
File "/home/meme_search/utilities/create.py", line 8, in process
    old_imgs_to_be_removed, new_imgs_to_be_indexed = get_input_directory_status(img_dir, sqlite_db_path)
File "/home/meme_search/utilities/status.py", line 22, in get_input_directory_status
    all_img_paths = collect_img_paths(img_dir)
File "/home/meme_search/utilities/imgs.py", line 18, in collect_img_paths
    raise e
File "/home/meme_search/utilities/imgs.py", line 10, in collect_img_paths
    all_img_paths = [os.path.join(img_dir, name) for name in os.listdir(img_dir) if name.split(".")[-1] in allowable_extensions]

I already tried creating the folder manually.

But my multimedia folder is managed by another app, so I can't put my media in data/input. How can I separate the configuration from the media folder?

zahidhanif commented 1 month ago

I had the same issue when using the docker compose file. Make sure you have the data folder from the repo with the input and dbs folders, including their contents (images and database files). When I did this, the compose file worked.
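As a rough way to verify that layout before starting the stack, here is a small sketch that checks a local data folder for the two subfolders mentioned above. The function name `missing_data_dirs` is made up for illustration, and the check only assumes the `input` and `dbs` folder names from this comment; it does not know which specific database files the app expects.

```python
from pathlib import Path

# Assumption: folder names are taken from the comment above ("input" and "dbs").
REQUIRED_SUBDIRS = ("input", "dbs")


def missing_data_dirs(data_dir: str) -> list[str]:
    """Return a note for each required subfolder that is absent or empty."""
    root = Path(data_dir)
    problems = []
    for name in REQUIRED_SUBDIRS:
        sub = root / name
        if not sub.is_dir():
            problems.append(f"{name}: missing")
        elif not any(sub.iterdir()):
            problems.append(f"{name}: empty")
    return problems
```

An empty list from `missing_data_dirs("./data")` suggests the folder at least has the expected shape; anything else points at what still needs to be copied from the repo.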

ProfesseurIssou commented 1 month ago

After cloning the repo, it works. Would it be possible for the app to download the needed files automatically? Or to include the default DB in the image?

neonwatty commented 1 month ago

Baking the default DB into the image was suggested as an entrypoint for the Docker image in this PR.

If copying the local data directory were built into the Docker build process via the Dockerfile, I feared we would be opening a can of worms: anyone who rebuilt the image after adding their own images and DB could end up with a very large build.

My take (at the time, at least) was to keep the data separate from the image.

In the short term, we can make the app fail more gracefully on file-not-found errors like

FileNotFoundError: [Errno 2] No such file or directory: '/home/data/input/'

And perhaps migrate to Postgres / pgvector or a sqlite extension that supports vector search, combining the vector and lookup tables in one place. Or one file when it comes to sqlite!
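To illustrate the "fail more gracefully" idea for the missing input directory, here is one possible sketch. It is not the repo's code: `collect_img_paths_safe` is a hypothetical variant of the `collect_img_paths` function seen in the traceback, and the extension set is an assumption that may differ from the real `allowable_extensions`.

```python
import os

# Assumption: the real extension list in the repo may differ.
ALLOWABLE_EXTENSIONS = {"jpg", "jpeg", "png"}


def collect_img_paths_safe(img_dir: str) -> list[str]:
    """List image paths in img_dir, with an actionable error if it is missing."""
    if not os.path.isdir(img_dir):
        raise FileNotFoundError(
            f"Image directory '{img_dir}' was not found. "
            "Check that the data volume is mounted and contains an input folder."
        )
    paths = []
    for name in os.listdir(img_dir):
        # splitext returns ".png"; strip the dot and ignore case.
        ext = os.path.splitext(name)[1].lstrip(".").lower()
        if ext in ALLOWABLE_EXTENSIONS:
            paths.append(os.path.join(img_dir, name))
    return sorted(paths)
```

The exception type stays the same, so existing callers would behave as before; only the message changes to point at the likely cause (the volume mount) rather than the bare path.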