pabloromeo / clusterplex

ClusterPlex is an extended version of Plex, which supports distributed Workers across a cluster to handle transcoding requests.
MIT License

Docker Compose failed to read dockerfile #242

Closed treverehrfurth closed 1 year ago

treverehrfurth commented 1 year ago

Describe the bug

I have been at this for a month now, trying to get this all to work, and have met problem after problem. This is probably not an issue with the actual code/repo but with my setup. I copied the docker-compose.yaml file and adjusted it to my setup, but when I try to run docker compose, I'm met with this error:

failed to solve: failed to read dockerfile: open /var/lib/docker/tmp/buildkit-mount3198982547/Dockerfile: no such file or directory

Here is my docker-compose.yaml file:

version: '3.8'

services:
  plex:
    container_name: plex
    build:
      context: ./pms
      dockerfile: ./extended-image/Dockerfile-development
    environment:
      VERSION: docker
      PUID: 1000
      PGID: 1000
      TZ: America/Chicago
      ORCHESTRATOR_URL: http://plex-orchestrator:3500
      PMS_SERVICE: plex     # This service. If you disable Local Relay then you must use PMS_IP instead
      PMS_PORT: "50000"
      TRANSCODE_OPERATING_MODE: both #(local|remote|both)
      TRANSCODER_VERBOSE: "1"   # 1=verbose, 0=silent
      LOCAL_RELAY_ENABLED: "1"
      LOCAL_RELAY_PORT: "32499"
    healthcheck:
      test: curl -fsS http://localhost:32400/identity > /dev/null || exit 1
      interval: 15s
      timeout: 15s
      retries: 5
      start_period: 30s
    volumes:
      - /srv/dockerdata/plex/config:/config
      - /mnt/share/treshare/transcode/plex:/transcode
      - /mnt/share/treshare/Torrents/Completed/TV:/data/tv
      - /mnt/share/treshare/Torrents/Completed/Movies:/data/movies
    ports:
      - 32499:32499   # LOCAL_RELAY_PORT
      - 50000:32400
      - 3005:3005
      - 8324:8324
      - 1900:1900/udp
      - 32410:32410/udp
      - 32412:32412/udp
      - 32413:32413/udp
      - 32414:32414/udp

  plex-orchestrator:
    container_name: plex-orchestrator
    build: ./orchestrator
    healthcheck:
      test: curl -fsS http://localhost:3500/health > /dev/null || exit 1
      interval: 15s
      timeout: 15s
      retries: 5
      start_period: 30s
    environment:
      TZ: America/Chicago
      LISTENING_PORT: 3500
      WORKER_SELECTION_STRATEGY: "LOAD_RANK" # RR | LOAD_CPU | LOAD_TASKS | LOAD_RANK (default)
    volumes:
      - /etc/localtime:/etc/localtime:ro
    ports:
      - 3500:3500

  plex-worker:
    build:
      context: ./worker
      dockerfile: ./extended-image/Dockerfile-development
    deploy:
      mode: replicated
      replicas: 1
    environment:
      VERSION: docker
      PUID: 1000
      PGID: 1000
      TZ: America/Chicago
      LISTENING_PORT: 3501      # used by the healthcheck
      STAT_CPU_INTERVAL: 2000   # interval for reporting worker load metrics
      ORCHESTRATOR_URL: http://plex-orchestrator:3500
      EAE_SUPPORT: "1"
    healthcheck:
      test: curl -fsS http://localhost:3501/health > /dev/null || exit 1
      interval: 15s
      timeout: 15s
      retries: 5
      start_period: 240s
    volumes:
      - /srv/dockerdata/plex/codecs:/codecs
      - /mnt/share/treshare/Torrents/Completed/TV:/data/tv
      - /mnt/share/treshare/Torrents/Completed/Movies:/data/movies
      - /mnt/share/treshare/transcode/plex:/transcode
volumes:
  plex-config:
  transcode-volume:
  codecs:

Here are the directories I have under /srv/dockerdata/plex: [image]

To Reproduce

Steps to reproduce the behavior:

  1. Create a docker-compose.yaml file with the code above
  2. Try to run `docker compose up` on it

Expected behavior

All the services should build and run.

pabloromeo commented 1 year ago

Hard to say what's going on, but those directories don't look correct, per the repo. There's no extended-image directory at the root of the repo, for example. You'll probably also not want to store things such as codecs or config within your code repo, but rather have them elsewhere. Maybe something got moved around and you've lost the actual location of the dockerfile within the specific component (pms, worker, or orchestrator). The easiest thing to try would be to clone the repo again, configure the .env file per the template, and do a docker-compose up --build or your docker compose up command on a clean copy.

Also, you don't actually need to build this from source; you can use the images that have already been built. Running it with the development Dockerfiles is mostly for development and local testing. For an actual deployment with distributed transcoding you'd need different physical machines, possibly using an orchestrator such as Docker Swarm or Kubernetes.
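
A minimal sketch of that alternative, swapping each build: section for a prebuilt image. The image names are the ones that appear elsewhere in this thread; the rest of each service definition is assumed to stay as in the repo's examples and is elided here.

```yaml
# Sketch only: each service pulls a published image instead of building
# from a local Dockerfile, so no ./pms or ./worker directory is needed.
services:
  plex:
    image: lscr.io/linuxserver/plex:latest
    environment:
      DOCKER_MODS: "ghcr.io/pabloromeo/clusterplex_dockermod:latest"
      # remaining environment, volumes and healthcheck as in the repo example

  plex-orchestrator:
    image: ghcr.io/pabloromeo/clusterplex_orchestrator:latest

  plex-worker:
    image: lscr.io/linuxserver/plex:latest
    environment:
      DOCKER_MODS: "ghcr.io/pabloromeo/clusterplex_worker_dockermod:latest"
```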

treverehrfurth commented 1 year ago

I originally tried to set this up with my Plex instance in Docker, adding the docker mod and then setting up the other containers, but it didn't work: nothing would play, and I was getting a PMS_IP or local relay error.

When I just cloned the repo and tweaked the docker-compose.yaml file, it worked, despite an error saying it couldn't do distributed transcoding. However, the IP address of the Plex server was tied to a Docker IP, and I couldn't port forward it for remote access.

I have multiple servers in an HA Proxmox cluster running multiple VMs, and I want to have all my Plex users on one server and send the transcode jobs out to the other servers. Is there a step-by-step install doc or video I could use to set this all up properly? Or, if you wouldn't mind, could you chat with me on Discord and show me what I'm doing wrong? My Discord is the same as my username: Pray4tre

treverehrfurth commented 1 year ago

This is the error I get when I just run plex with the dockermod installed on the plex docker image:

You must set either PMS_SERVICE or PMS_IP (either one), pointing to you Plex instance. PMS_SERVICE is only allowed in conjunction with LOCAL_RELAY_ENABLED='1'

For reference, here is my setup if it helps:

Plex docker stack with dockermod:

---
version: "2.1"
services:
  plex:
    image: lscr.io/linuxserver/plex:latest
    container_name: plex_bkup
    network_mode: host
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=America/Chicago
      - VERSION=docker
      - PLEX_CLAIM=zfyKAr3r8wGGbzbxeHVH
      - DOCKER_MODS=ghcr.io/pabloromeo/clusterplex_dockermod:latest
    volumes:
      - /srv/dockerdata/clusterplex_pms/config:/config
      - /mnt/share/treshare/Torrents/Completed/TV:/tv
      - /mnt/share/treshare/Torrents/Completed/Movies:/movies
      - /mnt/share/treshare:/treshare
    restart: unless-stopped

Worker docker stack:

---
version: "2.1"
services:
  plex:
    image: lscr.io/linuxserver/plex:latest
    container_name: clusterplex_worker_bkup
    network_mode: host
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=America/Chicago
      - VERSION=docker
      - PLEX_CLAIM=zfyKAr3r8wGGbzbxeHVH
      - DOCKER_MODS=ghcr.io/pabloromeo/clusterplex_worker_dockermod:latest
    volumes:
      - /srv/dockerdata/clusterplex_pms/config:/config
      - /mnt/share/treshare/Torrents/Completed/TV:/tv
      - /mnt/share/treshare/Torrents/Completed/Movies:/movies
      - /mnt/share/treshare:/treshare
    restart: unless-stopped

I have the orchestrator setup using this image: ghcr.io/pabloromeo/clusterplex_orchestrator:latest

pabloromeo commented 1 year ago

I believe you are missing configuration settings that need to be passed in as environment variables to the Plex service. See this example, from the docker-compose.yaml: https://github.com/pabloromeo/clusterplex/blob/master/docker-compose.yaml

      ORCHESTRATOR_URL: http://plex-orchestrator:3500
      PMS_SERVICE: plex
      PMS_PORT: "32400"
      TRANSCODE_OPERATING_MODE: both #(local|remote|both)
      TRANSCODER_VERBOSE: "1"   # 1=verbose, 0=silent
      LOCAL_RELAY_ENABLED: "1"
      LOCAL_RELAY_PORT: "32499"

treverehrfurth commented 1 year ago

I added that to my docker plex container, now it looks like this:

---
version: "2.1"
services:
  plex:
    image: lscr.io/linuxserver/plex:latest
    container_name: plex
    network_mode: host
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=America/Chicago
      - VERSION=docker
      - PLEX_CLAIM=zfyKAr3r8wGGbzbxeHVH
      - DOCKER_MODS=ghcr.io/pabloromeo/clusterplex_dockermod:latest
      - ORCHESTRATOR_URL=http://plex-orchestrator:3500
      - PMS_SERVICE=plex
      - PMS_PORT=32400
      - TRANSCODE_OPERATING_MODE=both #(local|remote|both)
      - TRANSCODER_VERBOSE=1   # 1=verbose, 0=silent
      - LOCAL_RELAY_ENABLED=1
      - LOCAL_RELAY_PORT=32499
    volumes:
      - /srv/dockerdata/clusterplex_pms/config:/config
      - /mnt/share/treshare/Torrents/Completed/TV:/tv
      - /mnt/share/treshare/Torrents/Completed/Movies:/movies
      - /mnt/share/treshare:/treshare
    restart: unless-stopped

It starts fine, but when I attempt to play any content it just sits there and spins like it did before. I'm not sure if it's an issue with the orchestrator or the worker.

I then tried updating the worker in portainer to this:

---
version: "2.1"
services:
  plex:
    image: lscr.io/linuxserver/plex:latest
    container_name: clusterplex_worker_bkup
    network_mode: host
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=America/Chicago
      - VERSION=docker
      - PLEX_CLAIM=zfyKAr3r8wGGbzbxeHVH
      - DOCKER_MODS=ghcr.io/pabloromeo/clusterplex_worker_dockermod:latest
      - LISTENING_PORT=3501      # used by the healthcheck
      - STAT_CPU_INTERVAL=2000   # interval for reporting worker load metrics
      - ORCHESTRATOR_URL=http://plex-orchestrator:3500
      - EAE_SUPPORT=1
    volumes:
      - /srv/dockerdata/clusterplex_pms/config:/config
      - /mnt/share/treshare/Torrents/Completed/TV:/tv
      - /mnt/share/treshare/Torrents/Completed/Movies:/movies
      - /mnt/share/treshare:/treshare
    restart: unless-stopped

and I'm getting the same results.

Could it be because my transcoder temporary directory in plex is just blank for default?

I'm really stuck on this and would appreciate it if you or anyone could hop on a quick Discord chat/call and help me get this running. I'm weeks in, have lost track of how many ways I've tried configuring this, and haven't got it to work yet.

treverehrfurth commented 1 year ago

I am finding that it is creating plex-transcode folders in my transcode directory, but it's still not playing anything. It just spins, and the folder itself is empty.

pabloromeo commented 1 year ago

Well, I suspect the problem is that you are trying to run these as standalone services on remote machines. The configuration examples are for orchestrators that run on distributed clusters, such as Docker Swarm or Kubernetes. For example, your configs tell Plex (and the workers) to connect to the ClusterPlex Orchestrator at http://plex-orchestrator:3500. However, if you run these services as standalone compose files, that name will not resolve to anything, and remote containers will not be aware of each other. So I'm guessing that for a setup like this you'll have to expose port 3500 on the orchestrator and then reference it by IP, or by some name that actually resolves to the machine where you are exposing that port. (If you run them with network_mode: host you don't need port mapping, but you'll still need to reference them by IP.) The same applies to the Plex server settings:

      - PMS_SERVICE=plex
      - PMS_PORT=32400

The name plex will not resolve to anything remotely, so you'll have to change that to use the PMS_IP environment variable instead, and put the IP of the machine that will be running the main Plex PMS, for them to be able to communicate.
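
As a rough sketch of that change, the Plex service's environment might look like the following. The address 192.168.1.25 is a placeholder for whichever machine runs the main PMS and the orchestrator, not a value to copy.

```yaml
# Sketch only: 192.168.1.25 is a placeholder host IP (assumption).
environment:
  - ORCHESTRATOR_URL=http://192.168.1.25:3500   # an address that resolves from this machine
  - PMS_IP=192.168.1.25                         # bare IP, used instead of PMS_SERVICE
  - PMS_PORT=32400
```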

treverehrfurth commented 1 year ago

You are correct that I am trying to run these as standalone services; however, I've only tried testing this on one machine, on a single Docker instance, not multiple. And they still aren't communicating. Any ideas on why that would be?

I've never used docker swarm or kubernetes yet but would like to not throw on another variable into the mix...

pabloromeo commented 1 year ago

Well, my guess is that even though you are running them all on one machine, you are using separate compose files. If I remember correctly, that creates a separate network for each one, and that network is what's used for DNS name resolution. So what I mentioned before about service names across compose files still applies: they won't resolve because the containers don't know about each other. But since you are using host network_mode, you can specify the machine IP and have them connect that way. Just be aware that using a single machine for Plex and a worker won't provide any benefit; it's only useful for local development and testing.
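
One possible alternative to referencing everything by IP, not tried in this thread, is to attach the separate compose files to a shared, pre-created network so that service names resolve across them. The network name clusterplex-net is hypothetical, and this applies only to services that do not use network_mode: host.

```yaml
# Sketch only. Create the network once beforehand:
#   docker network create clusterplex-net
# Then add the same networks sections to each compose file.
services:
  plex-orchestrator:
    image: ghcr.io/pabloromeo/clusterplex_orchestrator:latest
    networks:
      - clusterplex-net

networks:
  clusterplex-net:
    external: true   # use the existing network instead of creating a per-file one
```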

treverehrfurth commented 1 year ago

Ah, that makes a little more sense: the networks not being able to reach each other could be the problem. Again, I'm new to Docker in general as of a couple of months ago, as well as Proxmox, TrueNAS Scale, and VMs as a whole.

If I use network_mode: host like I am doing, I don't need to list any port mappings, such as this block from the docker-compose.yaml file above?

    ports:
      - 32499:32499   # LOCAL_RELAY_PORT
      - 50000:32400
      - 3005:3005
      - 8324:8324
      - 1900:1900/udp
      - 32410:32410/udp
      - 32412:32412/udp
      - 32413:32413/udp
      - 32414:32414/udp

I will remove those and reference them by IP. (Would that just be the IP of the machine, such as 192.168.1.25, which in this case would be the same for all the containers?)

If that doesn't work, I'll try running with the original docker-compose.yaml file, tweaked for my own directories and with the parts I'm guessing I don't need removed, such as (if I'm correct):

    build:
      context: ./pms
      dockerfile: ./extended-image/Dockerfile-development
.
.
.
    healthcheck:
      test: curl -fsS http://localhost:32400/identity > /dev/null || exit 1
      interval: 15s
      timeout: 15s
      retries: 5
      start_period: 30s

I also realize this won't provide any benefit on a single machine, but I should be able to add workers on one of the other 4 physical servers I have in a Proxmox HA cluster and make use of their GTX 1080 Tis for balancing hardware transcoding. That's my end goal. I'm just trying to get this to work on one machine first before adding yet another variable, when I'm barely grasping this installation on the software I'm currently running.

I appreciate you taking the time to stumble through this and help me. Hopefully this can help someone else with a current setup who's also struggling.

pabloromeo commented 1 year ago

Sure thing :) Glad to help if I can. Regarding your network host question, that's correct, you don't need to specify port mapping, since the container will be technically exposing the ports it needs on the actual host. That can cause issues though, if some of those ports are already in use in that host.
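
A minimal sketch of a host-networked Plex service, with the ports: block simply left out (other settings elided):

```yaml
# Sketch only: environment, volumes and healthcheck are omitted for brevity.
services:
  plex:
    image: lscr.io/linuxserver/plex:latest
    network_mode: host   # the container shares the host's network stack
    # No ports: section. Plex listens directly on the host's ports (32400, etc.),
    # so those ports must be free on the host.
```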

Regarding your example about removing things you don't need: instead of starting from the docker-compose file that builds everything from code, I'd recommend starting from the documented examples within the repo itself. For example, using dockermods: https://github.com/pabloromeo/clusterplex/blob/master/docs/docker-swarm/with-dockermods.yaml

Or using the custom images: https://github.com/pabloromeo/clusterplex/blob/master/docs/docker-swarm/with-custom-images.yaml

Also, don't remove the Health-checks, they are useful for docker to be able to monitor and restart containers if needed. Ultimately it's how docker determines that the container is up, running and ready.

treverehrfurth commented 1 year ago

Okay, I think I'm really close! I ran a stack using that dockermods YAML file in Portainer. Plex, the orchestrator, and the worker all deployed and seem to be working. But now I'm struggling to get it forwarded. It's now using a Docker-generated IP address instead of the local machine's, and I haven't been able to figure out how that should be port forwarded in either Docker/Portainer or my router (Ubiquiti Dream Machine). I'm a noob at networking in Docker.

treverehrfurth commented 1 year ago

Maybe this will help, can you change or show me what exactly needs to be changed? I added network_mode: host and have tried various ways of entering my PMS_IP but they aren't communicating yet.

So far for PMS_IP I have tried the following: http://localhost, 192.168.1.147, and http://192.168.1.147

But maybe I'm missing another spot to update?

version: '3.8'

services:
  plex:
    image: ghcr.io/linuxserver/plex:latest
    deploy:
      mode: replicated
      replicas: 1
    network_mode: host
    environment:
      DOCKER_MODS: "ghcr.io/pabloromeo/clusterplex_dockermod:latest"
      VERSION: docker
      PUID: 1000
      PGID: 1000
      TZ: America/Chicago
      ORCHESTRATOR_URL: http://plex-orchestrator:3500
      PMS_IP: http://192.168.1.147     # This service. If you disable Local Relay then you must use PMS_IP instead
      PMS_PORT: "32400"
      TRANSCODE_OPERATING_MODE: both #(local|remote|both)
      TRANSCODER_VERBOSE: "1"   # 1=verbose, 0=silent
      LOCAL_RELAY_ENABLED: "1"
      LOCAL_RELAY_PORT: "32499"
    healthcheck:
      test: curl -fsS http://localhost:32400/identity > /dev/null || exit 1
      interval: 15s
      timeout: 15s
      retries: 5
      start_period: 30s
    volumes:
      - /srv/dockerdata/plex/config:/config
      - /mnt/share/treshare/Torrents/Completed/TV:/tv
      - /mnt/share/treshare/Torrents/Completed/Movies:/movies
      - /mnt/share/treshare/transcode/plex:/transcode
    ports:
      - 32499:32499     # LOCAL_RELAY_PORT
      - 32400:32400
      - 3005:3005
      - 8324:8324
      - 1900:1900/udp
      - 32410:32410/udp
      - 32412:32412/udp
      - 32413:32413/udp
      - 32414:32414/udp

  plex-orchestrator:
    image: ghcr.io/pabloromeo/clusterplex_orchestrator:latest
    deploy:
      mode: replicated
      replicas: 1
      update_config:
        order: start-first
    healthcheck:
      test: curl -fsS http://localhost:3500/health > /dev/null || exit 1
      interval: 15s
      timeout: 15s
      retries: 5
      start_period: 30s
    environment:
      TZ: America/Chicago
      LISTENING_PORT: 3500
      WORKER_SELECTION_STRATEGY: "LOAD_RANK" # RR | LOAD_CPU | LOAD_TASKS | LOAD_RANK (default)
    volumes:
      - /etc/localtime:/etc/localtime:ro
    ports:
      - 3500:3500

  plex-worker:
    image: ghcr.io/linuxserver/plex:latest
    hostname: "plex-worker-{{.Node.Hostname}}"
    deploy:
      mode: replicated
      replicas: 2
    environment:
      DOCKER_MODS: "ghcr.io/pabloromeo/clusterplex_worker_dockermod:latest"
      VERSION: docker
      PUID: 1000
      PGID: 1000
      TZ: America/Chicago
      LISTENING_PORT: 3501      # used by the healthcheck
      STAT_CPU_INTERVAL: 2000   # interval for reporting worker load metrics
      ORCHESTRATOR_URL: http://plex-orchestrator:3500
      EAE_SUPPORT: "1"
    healthcheck:
      test: curl -fsS http://localhost:3501/health > /dev/null || exit 1
      interval: 15s
      timeout: 15s
      retries: 5
      start_period: 240s
    volumes:
      - /path/to/codecs:/codecs # (optional)
      - /mnt/share/treshare/Torrents/Completed/TV:/tv
      - /mnt/share/treshare/Torrents/Completed/Movies:/movies
      - /mnt/share/treshare/transcode/plex:/transcode

treverehrfurth commented 1 year ago

I think I may have got it working! With this:

version: '3.8'

services:
  plex:
    image: ghcr.io/linuxserver/plex:latest
    deploy:
      mode: replicated
      replicas: 1
    network_mode: host
    environment:
      DOCKER_MODS: "ghcr.io/pabloromeo/clusterplex_dockermod:latest"
      VERSION: docker
      PUID: 1000
      PGID: 1000
      TZ: America/Chicago
      ORCHESTRATOR_URL: http://192.168.1.147:3500
      PMS_IP: 192.168.1.147     # This service. If you disable Local Relay then you must use PMS_IP instead
      PMS_PORT: "32400"
      TRANSCODE_OPERATING_MODE: both #(local|remote|both)
      TRANSCODER_VERBOSE: "1"   # 1=verbose, 0=silent
      LOCAL_RELAY_ENABLED: "1"
      LOCAL_RELAY_PORT: "32499"
    healthcheck:
      test: curl -fsS http://192.168.1.147:32400/identity > /dev/null || exit 1
      interval: 15s
      timeout: 15s
      retries: 5
      start_period: 30s
    volumes:
      - /srv/dockerdata/plex/config:/config
      - /mnt/share/treshare/Torrents/Completed/TV:/tv
      - /mnt/share/treshare/Torrents/Completed/Movies:/movies
      - /mnt/share/treshare/transcode/plex:/transcode
    ports:
      - 32499:32499     # LOCAL_RELAY_PORT
      - 32400:32400
      - 3005:3005
      - 8324:8324
      - 1900:1900/udp
      - 32410:32410/udp
      - 32412:32412/udp
      - 32413:32413/udp
      - 32414:32414/udp

  plex-orchestrator:
    image: ghcr.io/pabloromeo/clusterplex_orchestrator:latest
    deploy:
      mode: replicated
      replicas: 1
      update_config:
        order: start-first
    healthcheck:
      test: curl -fsS http://192.168.1.147:3500/health > /dev/null || exit 1
      interval: 15s
      timeout: 15s
      retries: 5
      start_period: 30s
    environment:
      TZ: America/Chicago
      LISTENING_PORT: 3500
      WORKER_SELECTION_STRATEGY: "LOAD_RANK" # RR | LOAD_CPU | LOAD_TASKS | LOAD_RANK (default)
    volumes:
      - /etc/localtime:/etc/localtime:ro
    ports:
      - 3500:3500

  plex-worker:
    image: ghcr.io/linuxserver/plex:latest
    hostname: "plex-worker-{{.Node.Hostname}}"
    deploy:
      mode: replicated
      replicas: 2
    environment:
      DOCKER_MODS: "ghcr.io/pabloromeo/clusterplex_worker_dockermod:latest"
      VERSION: docker
      PUID: 1000
      PGID: 1000
      TZ: America/Chicago
      LISTENING_PORT: 3501      # used by the healthcheck
      STAT_CPU_INTERVAL: 2000   # interval for reporting worker load metrics
      ORCHESTRATOR_URL: http://192.168.1.147:3500
      EAE_SUPPORT: "1"
    healthcheck:
      test: curl -fsS http://192.168.1.147:3501/health > /dev/null || exit 1
      interval: 15s
      timeout: 15s
      retries: 5
      start_period: 240s
    volumes:
      - /path/to/codecs:/codecs # (optional)
      - /mnt/share/treshare/Torrents/Completed/TV:/tv
      - /mnt/share/treshare/Torrents/Completed/Movies:/movies
      - /mnt/share/treshare/transcode/plex:/transcode

Now my only question is: why does it spin up 2 worker containers? Is that expected behavior? Also, in Portainer, the containers for both workers constantly show as unhealthy.

pabloromeo commented 1 year ago

It spins up 2 workers because that's what is specified in your YAML:

    deploy:
      mode: replicated
      replicas: 2

You can set replicas to 1 if you like.

None of the containers should be unhealthy. I'd check the logs; I'm sure there's useful info there about why they are unhealthy.

One thing to note: The first time you start a worker, if the plex codecs aren't present they will all be downloaded, which can take a minute, until the process completes and healthchecks start passing.

treverehrfurth commented 1 year ago

Ah thank you for pointing the replicas out, that is why.

As for the unhealthy containers, it seems that it only registers as healthy when I set the ip to localhost vs the direct IP. Which means running the workers on any remote node for me in this configuration results in all remote node workers displaying unhealthy when they are running fine.

unhealthy worker on remote node:

version: '3.8'

services:
  plex-worker:
    image: ghcr.io/linuxserver/plex:latest
    hostname: "plex-worker-{{.Node.Hostname}}"
    deploy:
      mode: replicated
      replicas: 2
    environment:
      DOCKER_MODS: "ghcr.io/pabloromeo/clusterplex_worker_dockermod:latest"
      VERSION: docker
      PUID: 1000
      PGID: 1000
      TZ: America/Chicago
      LISTENING_PORT: 3501      # used by the healthcheck
      STAT_CPU_INTERVAL: 2000   # interval for reporting worker load metrics
      ORCHESTRATOR_URL: http://192.168.1.71:3500
      EAE_SUPPORT: "1"
    healthcheck:
      test: curl -fsS http://192.168.1.71:3501/health > /dev/null || exit 1
      interval: 15s
      timeout: 15s
      retries: 5
      start_period: 240s
    volumes:
      - /mnt/share/dockerdata/plex/codecs:/codecs # (optional)
      - /mnt/share/dockerdata/dizquetv:/dizquetv
      - /mnt/share/treshare/Torrents/Completed/TV:/tv
      - /mnt/share/treshare/Torrents/Completed/Movies:/movies
      - /mnt/share/treshare/transcode/plex:/transcode
    restart: unless-stopped

Healthy worker on same node as orchestrator/pms:

  plex-worker:
    image: ghcr.io/linuxserver/plex:latest
    hostname: "plex-worker-{{.Node.Hostname}}"
    deploy:
      mode: replicated
      replicas: 2
    environment:
      DOCKER_MODS: "ghcr.io/pabloromeo/clusterplex_worker_dockermod:latest"
      VERSION: docker
      PUID: 1000
      PGID: 1000
      TZ: America/Chicago
      LISTENING_PORT: 3501      # used by the healthcheck
      STAT_CPU_INTERVAL: 2000   # interval for reporting worker load metrics
      ORCHESTRATOR_URL: http://localhost:3500
      EAE_SUPPORT: "1"
    healthcheck:
      test: curl -fsS http://localhost:3501/health > /dev/null || exit 1
      interval: 15s
      timeout: 15s
      retries: 5
      start_period: 240s
    volumes:
      - /mnt/share/dockerdata/plex/codecs:/codecs # (optional)
      - /mnt/share/dockerdata/dizquetv:/dizquetv
      - /mnt/share/treshare/Torrents/Completed/TV:/tv
      - /mnt/share/treshare/Torrents/Completed/Movies:/movies
      - /mnt/share/treshare:/treshare
      - /mnt/share/treshare/transcode/plex:/transcode
    restart: unless-stopped

pabloromeo commented 1 year ago

Healthchecks are run by Docker itself and only concern its own service, so localhost is the way to go; there's no need to use an external IP. Especially since workers don't expose their listening port, attempting to connect to an IP outside the container's internal virtual network may just get a connection refused and fail the healthcheck. For the orchestrator URL setting, though, it's perfectly fine to use a remote IP, hostname, or DNS name, as needed.
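
Put together, a worker on a remote node might be sketched like this. The orchestrator address is an example value standing in for the orchestrator host's IP; the other settings are taken from the worker definitions earlier in the thread.

```yaml
# Sketch only: image, DOCKER_MODS and volumes are omitted for brevity.
services:
  plex-worker:
    environment:
      LISTENING_PORT: 3501
      ORCHESTRATOR_URL: http://192.168.1.147:3500   # remote address is fine here (example IP)
    healthcheck:
      # Runs inside the worker container, so localhost is the worker itself.
      test: curl -fsS http://localhost:3501/health > /dev/null || exit 1
      interval: 15s
      timeout: 15s
      retries: 5
      start_period: 240s
```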