Open epag opened 3 weeks ago
Original Redmine Comment Author Name: James (James) Original Date: 2020-11-18T15:54:35Z
eventsbroker/s are up before graphics/es
eventsbroker/s are up before worker/s
( although there is resilience in terms of connection retries, both within docker and within the contained app )
Original Redmine Comment Author Name: James (James) Original Date: 2020-11-23T14:03:44Z
Sort of related to this ticket in the sense it involves a check for something that stops the instantiation of the service when missing. Might be nice to conditionally fail on running the worker container when the @WRES_ENV_SUFFIX@ has not been set because the resulting "connection reset by peer" error is pretty opaque in relation to the missing variable. Alternatively, perhaps we could add an environment variable file (@.env@) with defaults in the service root dir, which is picked up automatically? Or both?
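A fail-fast guard at the top of the worker entrypoint could surface the missing variable directly. A minimal sketch, assuming only that the variable is named @WRES_ENV_SUFFIX@ (the helper itself is hypothetical, not from the repo):

```shell
#!/bin/sh
# Sketch of a fail-fast guard for a worker entrypoint. Only the variable
# name WRES_ENV_SUFFIX comes from the discussion above; the helper is
# hypothetical.
require_env() {
    # Fail with a clear message if the named variable is unset or empty.
    name="$1"
    eval "_val=\${$name:-}"
    if [ -z "$_val" ]; then
        echo "ERROR: $name is not set; refusing to start." >&2
        return 1
    fi
}

# At the top of the entrypoint one would then write:
#   require_env WRES_ENV_SUFFIX || exit 1
```

This turns the opaque "connection reset by peer" into a one-line error naming the missing variable.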
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2020-11-23T15:02:24Z
At some level (in this case the cluster mode administration) some knowledge of the system and general troubleshooting skills are needed. We shouldn't waste too much time making the cluster mode easier.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2020-11-23T15:03:44Z
The reason we cannot commit a realistic @.env@ is that ITSG believes that exposing hostnames is apparently a no-no. So we need to remove even the pattern, eventually, for the purpose of publication of the software.
Original Redmine Comment Author Name: James (James) Original Date: 2020-11-23T15:13:12Z
At some level, yes. But I annotate code to help other developers, so this is surely about the "some level". I don't see this as "timewasting". I see it as fixing a bug because there is no information about the origin of the problem.
Fair enough regarding exposing things in the repo. However, I think we can have a separate repo for stuff that we don't want to expose, which would be a private repo if we moved to github.
Original Redmine Comment Author Name: Chris (Chris) Original Date: 2020-11-23T15:16:37Z
One of the approaches I've been taking lately is the generation of these sorts of files via a separate build script included within the repo. The file is pretty static, so there's generally a template; command line arguments then supply the stuff we can't embed, and the script sticks it in. The https://github.com/NOAA-OWP/wres_output_service/blob/master/collect.py script in the output service is an ok example; it doesn't take a lot of inputs, though. The hydrographer is a better example, but that code hasn't been pushed yet.
Is that an approach that could be used here?
Original Redmine Comment Author Name: Chris (Chris) Original Date: 2020-12-03T15:17:06Z
I'm curious as to whether the host needs some sort of health check first. It sounds an awful lot like the docker daemon is being launched before everything has been mounted. It wouldn't be the first time these systems were doing something goofy like that.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2020-12-03T15:28:15Z
I don't know if health checks alone will solve the last checkbox ("/home directory is accessible by workers"), but maybe there's an option when mounting a volume to guarantee it's an existing volume rather than one to be created, either on its own or combined with a health check. Would that work?
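There is in fact a compose option for the "existing volume" idea: declaring the volume external. A sketch, with a hypothetical volume name:

```yaml
volumes:
  wres_home:
    # With external: true, compose fails fast if this volume does not
    # already exist, instead of silently creating an empty one.
    external: true
```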
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-25T22:24:14Z
Starting with "broker is up before tasker"
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-25T22:46:23Z
Following https://github.com/docker-library/healthcheck/tree/master/rabbitmq I see that docker shows the broker as starting then unhealthy, though it started and is healthy. Hmm.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-25T22:48:53Z
No execute permission on the healthcheck script.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-25T23:00:51Z
Docker seems to show the health correctly now: @43aa3dae24fe 93fa0c96b2b7 "docker-entrypoint.s…" 30 seconds ago Up 29 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1@ ... @43aa3dae24fe 93fa0c96b2b7 "docker-entrypoint.s…" 39 seconds ago Up 37 seconds (healthy) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1@
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-25T23:02:00Z
No magic auto-checking-for-healthy-or-startup in docker-compose, must be explicit, apparently.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-25T23:13:09Z
Satisfying to see it wait for the broker now.
[deployment]$ diff docker-compose-all-roles-20210119-8b881fe.yml docker-compose-all-roles-20210119-8b881fe_broker_with_healthcheck.yml
30,31c30,33
< - "broker"
< - "persister"
---
> broker:
> condition: service_healthy
> persister:
> condition: service_started
52c54
< image: "${DOCKER_REGISTRY}/wres/wres-broker:20210114-7f78a93"
---
> image: "93fa0c96b2b7"
69c71,72
< - "broker"
---
> broker:
> condition: service_healthy
[deployment]$ sudo docker run -d -v /var/run/docker.sock:/var/run/docker.sock -v "$PWD:$PWD" -w "$PWD" --cap-drop ALL --cpus 2 --memory 512M docker/compose:1.27.4 --file docker-compose-all-roles-20210119-8b881fe_broker_with_healthcheck.yml up --scale worker=4
5bdfdf115d62e2ec4ff81ad9bf5363f4a0343207342e99cf4400dab602845b56
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5bdfdf115d62 docker/compose:1.27.4 "sh /usr/local/bin/d…" 2 seconds ago Up 1 second magical_grothendieck
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bfbd06d42779 93fa0c96b2b7 "docker-entrypoint.s…" 1 second ago Up Less than a second (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
ee0f8e38332e redis:6.0.10-alpine3.12 "docker-entrypoint.s…" 1 second ago Up Less than a second 6379/tcp deployment_persister_1
5bdfdf115d62 docker/compose:1.27.4 "sh /usr/local/bin/d…" 3 seconds ago Up 3 seconds magical_grothendieck
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bfbd06d42779 93fa0c96b2b7 "docker-entrypoint.s…" 3 seconds ago Up 2 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
ee0f8e38332e redis:6.0.10-alpine3.12 "docker-entrypoint.s…" 3 seconds ago Up 2 seconds 6379/tcp deployment_persister_1
5bdfdf115d62 docker/compose:1.27.4 "sh /usr/local/bin/d…" 5 seconds ago Up 4 seconds magical_grothendieck
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bfbd06d42779 93fa0c96b2b7 "docker-entrypoint.s…" 6 seconds ago Up 5 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
ee0f8e38332e redis:6.0.10-alpine3.12 "docker-entrypoint.s…" 6 seconds ago Up 5 seconds 6379/tcp deployment_persister_1
5bdfdf115d62 docker/compose:1.27.4 "sh /usr/local/bin/d…" 8 seconds ago Up 7 seconds magical_grothendieck
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bfbd06d42779 93fa0c96b2b7 "docker-entrypoint.s…" 8 seconds ago Up 7 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
ee0f8e38332e redis:6.0.10-alpine3.12 "docker-entrypoint.s…" 8 seconds ago Up 7 seconds 6379/tcp deployment_persister_1
5bdfdf115d62 docker/compose:1.27.4 "sh /usr/local/bin/d…" 10 seconds ago Up 9 seconds magical_grothendieck
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bfbd06d42779 93fa0c96b2b7 "docker-entrypoint.s…" 11 seconds ago Up 10 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
ee0f8e38332e redis:6.0.10-alpine3.12 "docker-entrypoint.s…" 11 seconds ago Up 10 seconds 6379/tcp deployment_persister_1
5bdfdf115d62 docker/compose:1.27.4 "sh /usr/local/bin/d…" 13 seconds ago Up 12 seconds magical_grothendieck
[deployment]$ docker logs 5bdfdf11
Creating network "deployment_wres_net" with driver "bridge"
Creating deployment_persister_1 ...
Creating deployment_broker_1 ...
[deployment]$ docker logs 5bdfdf11
Creating network "deployment_wres_net" with driver "bridge"
Creating deployment_persister_1 ...
Creating deployment_broker_1 ...
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bfbd06d42779 93fa0c96b2b7 "docker-entrypoint.s…" 24 seconds ago Up 23 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
ee0f8e38332e redis:6.0.10-alpine3.12 "docker-entrypoint.s…" 24 seconds ago Up 23 seconds 6379/tcp deployment_persister_1
5bdfdf115d62 docker/compose:1.27.4 "sh /usr/local/bin/d…" 26 seconds ago Up 25 seconds magical_grothendieck
[deployment]$ docker logs 5bdfdf11
Creating network "deployment_wres_net" with driver "bridge"
Creating deployment_persister_1 ...
Creating deployment_broker_1 ...
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bfbd06d42779 93fa0c96b2b7 "docker-entrypoint.s…" 29 seconds ago Up 28 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
ee0f8e38332e redis:6.0.10-alpine3.12 "docker-entrypoint.s…" 29 seconds ago Up 28 seconds 6379/tcp deployment_persister_1
5bdfdf115d62 docker/compose:1.27.4 "sh /usr/local/bin/d…" 31 seconds ago Up 30 seconds magical_grothendieck
The tasker and workers came up after it was truly up and good. Now looks like this:
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a78f3e63fe83 nwcal-registry.[host]/wres/wres-worker:20210119-74698fa "./docker-entrypoint…" 2 minutes ago Up 2 minutes deployment_worker_1
ea39497e3a44 nwcal-registry.[host]/wres/wres-worker:20210119-74698fa "./docker-entrypoint…" 2 minutes ago Up 2 minutes deployment_worker_2
2f35fe726dda nwcal-registry.[host]/wres/wres-worker:20210119-74698fa "./docker-entrypoint…" 2 minutes ago Up 2 minutes deployment_worker_4
54e619385fc4 nwcal-registry.[host]/wres/wres-worker:20210119-74698fa "./docker-entrypoint…" 2 minutes ago Up 2 minutes deployment_worker_3
060c3a3faa23 nwcal-registry.[host]/wres/wres-tasker:20210114-7f78a93 "bin/wres-tasker" 2 minutes ago Up 2 minutes 0.0.0.0:443->8443/tcp deployment_tasker_1
bfbd06d42779 93fa0c96b2b7 "docker-entrypoint.s…" 3 minutes ago Up 3 minutes (healthy) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
ee0f8e38332e redis:6.0.10-alpine3.12 "docker-entrypoint.s…" 3 minutes ago Up 3 minutes 6379/tcp deployment_persister_1
5bdfdf115d62 docker/compose:1.27.4 "sh /usr/local/bin/d…" 3 minutes ago Up 3 minutes magical_grothendieck
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-25T23:18:20Z
In order to add the health check to the redis image, we need to separately build a redis image that wraps the existing one and adds the health check. We should probably have it run as @wres_docker@ as well, instead of whatever it currently runs as.
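A wrapper image along those lines might look like this sketch; the user name, timings, and check command are assumptions, not from the repo:

```dockerfile
FROM redis:6.0.10-alpine3.12

# Run as a dedicated account instead of the image default
# (user/group names assumed).
RUN addgroup -S wres_docker && adduser -S -G wres_docker wres_docker
USER wres_docker

# Report healthy only once redis answers PONG, i.e. after the AOF has
# been read; during loading, PING returns a LOADING error instead.
HEALTHCHECK --interval=5s --timeout=5s --retries=3 --start-period=15m \
    CMD redis-cli ping | grep -q PONG || exit 1
```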
Original Redmine Comment Author Name: James (James) Original Date: 2021-01-25T23:54:58Z
On reflection, I wonder if it isn't better to avoid declaring these dependencies via docker-compose and any associated health conditionality. Each dependency should have its own built-in resilience that works regardless of the environment in which it is running, and not only during service start-up. I believe this declaration is gone in the v3 docker-compose format anyway, so it won't port.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-26T15:03:37Z
I'll discuss why I disagree after I do the work.
Original Redmine Comment Author Name: Chris (Chris) Original Date: 2021-01-26T15:19:06Z
With the docker-compose spec version 2.1+, you can combine the @depends_on@ tag with the @healthcheck@ tag in your docker-compose.yml file to only launch service b after service a has started and has been declared healthy without having to write a wrapper image. This is usually the way to go with 3rd party images for finished products (redis, postgres, etc) since they have people focused on making sure that the image on its own is solid. It is then up to the docker-compose configuration to tie everything together, which is part of what it was made for. This somewhat limits the points of failure to misconfiguration and not in the image makeup. It's much easier to fix things like security gaps through smart deployment configuration rather than through static artifact generation.
This comment isn't intended to guide, just to highlight features available to us.
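In compose file terms, the combination looks roughly like this (2.3 format; the service names, image tags, and check details are illustrative, not from the repo):

```yaml
version: '2.3'
services:
  broker:
    image: "rabbitmq:3-management"
    healthcheck:
      # Probe the broker itself; healthy only when it answers.
      test: ["CMD", "rabbitmq-diagnostics", "-q", "ping"]
      interval: 20s
      timeout: 10s
      retries: 3
  tasker:
    image: "example/tasker"
    depends_on:
      broker:
        # Wait for a passing health check, not merely a started container.
        condition: service_healthy
```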
Original Redmine Comment Author Name: Chris (Chris) Original Date: 2021-01-26T15:29:24Z
Something I just caught - version 3+ of compose stripped conditional @depends_on@ clauses. An alternate approach there is a start-of-container script that checks dependent containers' health before continuing. This approach is commonly used for containers that depend on postgres containers.
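A start-of-container check of that kind can be a small polling loop run ahead of the real entrypoint. A sketch, with the probe command left to the caller and the retry budget purely illustrative:

```shell
#!/bin/sh
# wait_for CMD...: retry CMD until it succeeds, or give up after a
# fixed budget (60 attempts, 5s apart - numbers are illustrative).
wait_for() {
    tries=0
    until "$@"; do
        tries=$((tries + 1))
        if [ "$tries" -ge 60 ]; then
            echo "gave up waiting for: $*" >&2
            return 1
        fi
        sleep 5
    done
}

# A v3-era entrypoint might then do (hypothetical host/port/binary):
#   wait_for nc -z broker 5671 && exec bin/wres-tasker
```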
Original Redmine Comment Author Name: James (James) Original Date: 2021-01-26T15:47:14Z
Right, gone from 3+ compose file version. Might want to ask the question: why? (I don't know the answer, but we can search for their thinking... edit: but it's probably related to swarm and what resilience means in that new context).
But my version of the answer is that it's better to build resilience into the contained applications.
Startup covers only a fraction of failure modes. Another failure mode is that a container or containers or the contained applications within them die at runtime, in any number or sequence. If application B depends on application A and a startup health check enforces that B starts when A is healthy, but B is not otherwise resilient, then what happens when A dies at runtime? Or what happens when both A and B die at runtime and B restarts before A (does the docker-compose start-up sequence apply to subsets of containers at all times, including runtime, or only to all containers on start-up?). edit: or what happens when we deploy outside of docker?
Overall, the smell that I am trying to communicate is that these settings are as likely to obfuscate as assist - it is better for the applications to be resilient. The only resilience that containers should have is that they should restart upon failure. But, regarding the contained applications, if the worker starts before the eventsbroker or the graphics, that should be fine. If the eventsbroker starts after the worker and before the graphics and takes N minutes to become healthy, that should be fine. Etc. Regardless of docker.
Original Redmine Comment Author Name: Chris (Chris) Original Date: 2021-01-26T16:18:51Z
I'm too tired to string together the sentences that express how much I wholeheartedly agree, so here's a "link":https://vlab.noaa.gov/redmine/projects/owp-wres-gui/repository/revisions/master/entry/entrypoint.sh to the GUI entrypoint script that seems to do what you're referencing. That doesn't poll and pause the app server when postgres is considered unhealthy, though.
> Might want to ask the question: why? (I don't know the answer, but we can search for their thinking... edit: but it's probably related to swarm and what resilience means in that new context).
Docker has been trying to switch people away from compose and into swarm, even for single node swarms. Swarm naturally waits to direct traffic based on dependent health, so it was rendered moot. You can run into situations where you have 15 replicas with only 8 being healthy so it still has to manage that behavior. Health checks are really important to swarm (almost a requirement, really), so I'm not surprised that they had to play with it.
Original Redmine Comment Author Name: James (James) Original Date: 2021-01-26T16:28:26Z
Right - stuff like that. Build it into the contained applications, I say.
I am not against using additional docker tools if it smooths the way to something in docker or overcomes some bug (to be fixed) whereby the contained applications must start in a particular order, but I think that is a bug nonetheless - for any architecture composed of several microservices, you need to build in resilience out of the box before worrying about docker.
Makes sense re: swarm - I know nothing about it, but it makes sense that you don't send work to containers that are unhealthy (which is a separate issue from building order dependencies into restarts via docker - to be clear, health checks make perfect sense and we should have those).
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-26T16:45:17Z
The docs seem to indicate that there is an indefinite start period by default; however, when the container takes more than about a minute to start up, the default 3 health checks with the default 20-second retries report "unhealthy", causing the startup sequence to stop. Trying the 2.3 format to see if explicitly specifying "0" (supposedly the default, meaning indefinite, I thought) works.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-26T16:49:45Z
Confirmed: with start_period set to 0, interval to 5s, timeout to 7s, and retries to 2, the unhealthy state comes more quickly in this case. So apparently the startup time needs to be covered by the interval and retries as well.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-26T17:28:08Z
Trying a short interval and small retries with a non-zero @start_period@ to see if that properly separates startup from healthy status at runtime. Yes, that appears to work as expected.
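So the working shape pairs a generous start_period with tight runtime checks; roughly (values illustrative, matching the spirit of the diff below):

```yaml
healthcheck:
  # Startup: failing probes during the first 15 minutes do not count.
  start_period: 15m
  # Runtime: after startup, a few quick consecutive failures mark the
  # container unhealthy.
  interval: 5s
  timeout: 7s
  retries: 3
```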
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-26T17:40:25Z
Kubernetes appears to nicely distinguish the aspects into startup, liveness, readiness: https://medium.com/avmconsulting-blog/how-to-perform-health-checks-in-kubernetes-k8s-a4e5300b1f9d
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-26T17:50:57Z
Diff of seemingly working compose:
[deployment]$ diff docker-compose-all-roles-20210119-8b881fe.yml docker-compose-all-roles-20210119-8b881fe_with_healthchecks.yml
1c1
< version: '2.2'
---
> version: '2.3'
13c13
< image: "redis:6.0.10-alpine3.12"
---
> image: "9329653d0aaa"
17c17
< - /mnt/wres_share/job_data:/data
---
> - /mnt/wres_share/job_data2:/data
23a24,26
> # Allow 15+ minutes for startup before failing
> healthcheck:
> start_period: 15m
30,31c33,36
< - "broker"
< - "persister"
---
> broker:
> condition: service_healthy
> persister:
> condition: service_healthy
52c57
< image: "${DOCKER_REGISTRY}/wres/wres-broker:20210114-7f78a93"
---
> image: "93fa0c96b2b7"
64a70,72
> # Allow 5 minutes for startup before failing
> healthcheck:
> start_period: 5m
69c77,78
< - "broker"
---
> broker:
> condition: service_healthy
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-26T21:45:51Z
The critical ones for 5.5 are ready (meaning ones that affect functionality of the software in certain cases), pending NCEP coming back online so I can commit them. Maybe the /home directory thing is better suited to #82962: use docker NFS volume or NFS from within the container. That should solve the "cannot access home after reboot" issue for the most part.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-27T21:18:45Z
commit:c8f3b13be164794f00e9c74646b59cf91378f9a1 has the added healthchecks for redis and broker. Does not have changes for home or eventsbroker. Does not yet include the wres-redis image in the @scripts/dockerize.sh@ image build tool.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-27T22:11:18Z
commit:a7eae95cb165fa1490ae1b0a8fa62961ed0e07d6 has updated @scripts/dockerize.sh@ with the redis image.
At least these two commits should make it into 5.5 but I don't know about the remainder.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-27T23:43:44Z
Launching now waits for the redis instance to read its aof.
[deployment]$ sudo docker run -d -v /var/run/docker.sock:/var/run/docker.sock -v "$PWD:$PWD" -w "$PWD" --cap-drop ALL --cpus 2 --memory 512M docker/compose:1.27.4 --file docker-compose-all-roles-20210127-be74051.yml up --scale worker=2
8c4f6d66197ff29903ceafa34c3c0ece3bc67176593406196a3a3f3428cc0572
[deployment]$ docker logs 8c4f6d6619
Creating deployment_persister_1 ...
Creating deployment_broker_1 ...
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ad3f4ccc9184 nwcal-registry.[host]/wres/wres-broker:20210127-d64a19d "docker-entrypoint.s…" 6 seconds ago Up 5 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
44abf2d9f6e8 nwcal-registry.[host]/wres/wres-redis:20210127-d64a19d "docker-entrypoint.s…" 6 seconds ago Up 5 seconds (health: starting) 6379/tcp deployment_persister_1
8c4f6d66197f docker/compose:1.27.4 "sh /usr/local/bin/d…" 7 seconds ago Up 7 seconds youthful_keller
[deployment]$ docker logs 8c4f6d6619
Creating deployment_persister_1 ...
Creating deployment_broker_1 ...
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ad3f4ccc9184 nwcal-registry.[host]/wres/wres-broker:20210127-d64a19d "docker-entrypoint.s…" 10 seconds ago Up 8 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
44abf2d9f6e8 nwcal-registry.[host]/wres/wres-redis:20210127-d64a19d "docker-entrypoint.s…" 10 seconds ago Up 8 seconds (health: starting) 6379/tcp deployment_persister_1
8c4f6d66197f docker/compose:1.27.4 "sh /usr/local/bin/d…" 11 seconds ago Up 10 seconds youthful_keller
[deployment]$ docker logs deployment_persister_1
1:C 27 Jan 2021 23:42:00.424 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 27 Jan 2021 23:42:00.424 # Redis version=6.0.10, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 27 Jan 2021 23:42:00.424 # Configuration loaded
1:M 27 Jan 2021 23:42:00.428 * Running mode=standalone, port=6379.
1:M 27 Jan 2021 23:42:00.428 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 27 Jan 2021 23:42:00.428 # Server initialized
1:M 27 Jan 2021 23:42:00.428 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:M 27 Jan 2021 23:42:00.428 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled (set to 'madvise' or 'never').
1:M 27 Jan 2021 23:42:00.432 * Reading RDB preamble from AOF file...
1:M 27 Jan 2021 23:42:00.432 * Loading RDB produced by version 6.0.8
1:M 27 Jan 2021 23:42:00.432 * RDB age 6731941 seconds
1:M 27 Jan 2021 23:42:00.432 * RDB memory usage when created 659.14 Mb
1:M 27 Jan 2021 23:42:00.432 * RDB has an AOF tail
1:M 27 Jan 2021 23:42:02.624 * Reading the remaining AOF tail...
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ad3f4ccc9184 nwcal-registry.[host]/wres/wres-broker:20210127-d64a19d "docker-entrypoint.s…" 23 seconds ago Up 21 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
44abf2d9f6e8 nwcal-registry.[host]/wres/wres-redis:20210127-d64a19d "docker-entrypoint.s…" 23 seconds ago Up 21 seconds (health: starting) 6379/tcp deployment_persister_1
8c4f6d66197f docker/compose:1.27.4 "sh /usr/local/bin/d…" 24 seconds ago Up 23 seconds youthful_keller
Author Name: Jesse (Jesse) Original Redmine Issue: 85103, https://vlab.noaa.gov/redmine/issues/85103 Original Date: 2020-11-18
Docker health checks to ensure what you see in the below checklist
Redmine related issue(s): 85546, 85803