B-urb / doclytics

A document analyzer for paperless-ngx using ollama
MIT License

All in one Docker Compose? #77

Open nodecentral opened 1 month ago

nodecentral commented 1 month ago

Hi

Perhaps a long shot, but my paperless-ngx Docker Compose file already includes a number of additional applications, such as Tika and Gotenberg. Is there any chance a single ‘all in one’ Docker Compose file could be made available that adds doclytics, Ollama, and the other required elements as well?

B-urb commented 1 month ago

Hi @nodecentral, is your idea to have a docker-compose file for everything, including paperless-ngx and Tika/Gotenberg, or just for doclytics and Ollama together?

nodecentral commented 1 month ago

Hi @B-urb, yes absolutely, it would be amazing to have an all in one application/solution compose.yml.

In addition, recommendations on the machine specification needed to run it all (based on the number of documents, etc.) for the best possible experience would be the icing on the cake. 🤩

nodecentral commented 1 month ago

I'm not great with Docker, but to bring everything together into one compose file that anyone could then run, I'm thinking it could be something like this (untested):

version: '3.8'
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    volumes:
      - /share/Container/ollama:/root/.ollama
    ports:
      - "11434:11434"
    pull_policy: always
    tty: true
    restart: unless-stopped

  doclytics:
    image: bjoern5urban/doclytics:latest
    environment:
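      # Use the compose service name and container port here; 127.0.0.1 inside
      # this container would point at doclytics itself, not at paperless-ngx.
      PAPERLESS_BASE_URL: http://webserver:8000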
      PAPERLESS_TOKEN: yourapitoken
    volumes:
      - /share/Container/doclytics/data:/app/data

  open-webui:
    image: ghcr.io/open-webui/open-webui:latest
    container_name: open-webui
    volumes:
      - /share/Container/ollama/webui:/app/backend/data
    depends_on:
      - ollama
    ports:
      - "8282:8080"
    environment:
      - 'OLLAMA_BASE_URL=http://ollama:11434'
      - 'WEBUI_SECRET_KEY='
    extra_hosts:
      - host.docker.internal:host-gateway
    restart: unless-stopped

  redis:
    image: redis:7
    container_name: paperless-redis
    restart: unless-stopped
    volumes:
      - /share/Container/paperlessredis:/data

  db:
    image: postgres:14
    container_name: paperless-db
    restart: unless-stopped
    volumes:
      - /share/Container/paperlessdb:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: paperless
      POSTGRES_USER: paperless
      POSTGRES_PASSWORD: paperless

  webserver:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    container_name: paperlessngx
    restart: unless-stopped
    privileged: true
    depends_on:
      - db
      - redis
      - gotenberg
      - tika
    ports:
      - 8777:8000
    healthcheck:
      test: ["CMD", "curl", "-fs", "-S", "--max-time", "2", "http://localhost:8000"]
      interval: 30s
      timeout: 10s
      retries: 5
    volumes:
      - /share/Container/paperless/data:/usr/src/paperless/data
      - /share/Container/paperless/media:/usr/src/paperless/media
      - /share/Container/paperless/export:/usr/src/paperless/export
      - /share/Container/paperless/consume:/usr/src/paperless/consume
      - /share/Container/paperless/scripts:/usr/src/paperless/scripts
      - /share/Container/paperless/trash:/usr/src/paperless/trash
    environment: 
      PAPERLESS_REDIS: redis://redis:6379
      PAPERLESS_DBHOST: db
      USERMAP_UID: 1005
      USERMAP_GID: 1000
      PAPERLESS_TIME_ZONE: Europe/London
      PAPERLESS_ADMIN_USER: username
      PAPERLESS_ADMIN_PASSWORD: password
      PAPERLESS_CONSUMER_RECURSIVE: "true"
      PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS: "true"
      # PAPERLESS_CONSUMER_POLLING: 5
      PAPERLESS_OCR_LANGUAGE: eng
      PAPERLESS_TIKA_ENABLED: 1
      PAPERLESS_TIKA_GOTENBERG_ENDPOINT: http://gotenberg:3000
      PAPERLESS_TIKA_ENDPOINT: http://tika:9998
      PAPERLESS_TRASH_DIR: /usr/src/paperless/trash/
      PAPERLESS_CONSUMER_DELETE_DUPLICATES: "true"
      PAPERLESS_CONSUMER_ENABLE_BARCODES: "true"
      PAPERLESS_CONSUMER_IGNORE_PATTERNS: '[".DS_STORE/*", "._*", ".stfolder/*", ".stversions/*", ".localized/*", ".@__thumb/*", "desktop.ini"]'

  gotenberg:
    image: docker.io/gotenberg/gotenberg:7.8
    restart: unless-stopped
    container_name: gotenberg
    ports:
      - 3000:3000
    command:
      - "gotenberg"
      - "--chromium-disable-routes=true"

  tika:
    image: ghcr.io/paperless-ngx/tika
    container_name: tika
    ports:
      - 9998:9998
    restart: always
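
One thing that may be worth adding (equally untested): as written, doclytics has no startup ordering, so it could come up before paperless-ngx is ready to answer API requests. A minimal sketch of how that might be handled is below. It assumes Docker Compose v2 (the `docker compose` plugin, which follows the Compose Specification) and reuses the `webserver` healthcheck already defined above. Whether doclytics retries on its own when paperless-ngx is unavailable is not something this thread confirms, so treat this as a precaution rather than a requirement.

```yaml
# Fragment to merge into the doclytics service above (assumes Docker Compose v2).
services:
  doclytics:
    depends_on:
      webserver:
        # wait until the paperless-ngx healthcheck defined above passes
        condition: service_healthy
      ollama:
        # ollama has no healthcheck in this file, so only wait for it to start
        condition: service_started
```

With this in place, `docker compose up -d` should hold back the doclytics container until the `webserver` container reports healthy.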