Open epag opened 3 weeks ago
Original Redmine Comment Author Name: James (James) Original Date: 2020-11-18T15:54:35Z
eventsbroker/s are up before graphics/es
eventsbroker/s are up before worker/s
( although there is resilience in terms of connection retries, both within docker and within the contained app )
Original Redmine Comment Author Name: James (James) Original Date: 2020-11-23T14:03:44Z
Sort of related to this ticket in the sense it involves a check for something that stops the instantiation of the service when missing. Might be nice to conditionally fail on running the worker container when the @WRES_ENV_SUFFIX@ has not been set because the resulting "connection reset by peer" error is pretty opaque in relation to the missing variable. Alternatively, perhaps we could add an environment variable file (@.env@) with defaults in the service root dir, which is picked up automatically? Or both?
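A fail-fast guard at the top of the worker entrypoint could surface the missing variable directly. A minimal sketch, assuming only that the variable is named @WRES_ENV_SUFFIX@ (the helper itself is hypothetical, not from the repo):

```shell
#!/bin/sh
# Sketch of a fail-fast guard for a worker entrypoint. Only the variable
# name WRES_ENV_SUFFIX comes from the discussion above; the helper is
# hypothetical.
require_env() {
    # Fail with a clear message if the named variable is unset or empty.
    name="$1"
    eval "_val=\${$name:-}"
    if [ -z "$_val" ]; then
        echo "ERROR: $name is not set; refusing to start." >&2
        return 1
    fi
}

# At the top of the entrypoint one would then write:
#   require_env WRES_ENV_SUFFIX || exit 1
```

This turns the opaque "connection reset by peer" into a one-line error naming the missing variable.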
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2020-11-23T15:02:24Z
At some level (in this case the cluster mode administration) some knowledge of the system and general troubleshooting skills are needed. We shouldn't waste too much time making the cluster mode easier.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2020-11-23T15:03:44Z
The reason we cannot commit a realistic @.env@ is that ITSG believes that exposing hostnames is apparently a no-no. So we need to remove even the pattern, eventually, for the purpose of publication of the software.
Original Redmine Comment Author Name: James (James) Original Date: 2020-11-23T15:13:12Z
At some level, yes. But I annotate code to help other developers, so this is surely about the "some level". I don't see this as "timewasting". I see it as fixing a bug because there is no information about the origin of the problem.
Fair enough regarding exposing things in the repo. However, I think we can have a separate repo for stuff that we don't want to expose, which would be a private repo if we moved to github.
Original Redmine Comment Author Name: Chris (Chris) Original Date: 2020-11-23T15:16:37Z
One of the approaches I've been taking lately is the generation of these sorts of files via a separate build script included within the repo. The file is pretty static, so there's generally a template; command line arguments then supply the stuff we can't embed, and the script sticks it in. The https://github.com/NOAA-OWP/wres_output_service/blob/master/collect.py script in the output service is an ok example; it doesn't take a lot of inputs, though. The hydrographer is a better example, but that code hasn't been pushed yet.
Is that an approach that could be used here?
Original Redmine Comment Author Name: Chris (Chris) Original Date: 2020-12-03T15:17:06Z
I'm curious as to whether the host needs some sort of health check first. It sounds an awful lot like the docker daemon is being launched before everything has been mounted. It wouldn't be the first time these systems were doing something goofy like that.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2020-12-03T15:28:15Z
I don't know if health checks alone will solve the last checkbox ("/home directory is accessible by workers"), but maybe there's an option when mounting a volume to guarantee it's an existing volume rather than one to be created, either on its own or combined with a health check. Would that work?
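There is in fact a compose option for the "existing volume" idea: declaring the volume external. A sketch, with a hypothetical volume name:

```yaml
volumes:
  wres_home:
    # With external: true, compose fails fast if this volume does not
    # already exist, instead of silently creating an empty one.
    external: true
```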
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-25T22:24:14Z
Starting with "broker is up before tasker"
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-25T22:46:23Z
Following https://github.com/docker-library/healthcheck/tree/master/rabbitmq I see that docker shows the broker as starting then unhealthy, though it started and is healthy. Hmm.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-25T22:48:53Z
No execute permission on the healthcheck script.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-25T23:00:51Z
Docker seems to show the health correctly now: @43aa3dae24fe 93fa0c96b2b7 "docker-entrypoint.s…" 30 seconds ago Up 29 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1@ ... @43aa3dae24fe 93fa0c96b2b7 "docker-entrypoint.s…" 39 seconds ago Up 37 seconds (healthy) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1@
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-25T23:02:00Z
No magic auto-checking-for-healthy-or-startup in docker-compose, must be explicit, apparently.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-25T23:13:09Z
Satisfying to see it wait for the broker now.
[deployment]$ diff docker-compose-all-roles-20210119-8b881fe.yml docker-compose-all-roles-20210119-8b881fe_broker_with_healthcheck.yml
30,31c30,33
< - "broker"
< - "persister"
---
> broker:
> condition: service_healthy
> persister:
> condition: service_started
52c54
< image: "${DOCKER_REGISTRY}/wres/wres-broker:20210114-7f78a93"
---
> image: "93fa0c96b2b7"
69c71,72
< - "broker"
---
> broker:
> condition: service_healthy
[deployment]$ sudo docker run -d -v /var/run/docker.sock:/var/run/docker.sock -v "$PWD:$PWD" -w "$PWD" --cap-drop ALL --cpus 2 --memory 512M docker/compose:1.27.4 --file docker-compose-all-roles-20210119-8b881fe_broker_with_healthcheck.yml up --scale worker=4
5bdfdf115d62e2ec4ff81ad9bf5363f4a0343207342e99cf4400dab602845b56
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5bdfdf115d62 docker/compose:1.27.4 "sh /usr/local/bin/d…" 2 seconds ago Up 1 second magical_grothendieck
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bfbd06d42779 93fa0c96b2b7 "docker-entrypoint.s…" 1 second ago Up Less than a second (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
ee0f8e38332e redis:6.0.10-alpine3.12 "docker-entrypoint.s…" 1 second ago Up Less than a second 6379/tcp deployment_persister_1
5bdfdf115d62 docker/compose:1.27.4 "sh /usr/local/bin/d…" 3 seconds ago Up 3 seconds magical_grothendieck
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bfbd06d42779 93fa0c96b2b7 "docker-entrypoint.s…" 3 seconds ago Up 2 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
ee0f8e38332e redis:6.0.10-alpine3.12 "docker-entrypoint.s…" 3 seconds ago Up 2 seconds 6379/tcp deployment_persister_1
5bdfdf115d62 docker/compose:1.27.4 "sh /usr/local/bin/d…" 5 seconds ago Up 4 seconds magical_grothendieck
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bfbd06d42779 93fa0c96b2b7 "docker-entrypoint.s…" 6 seconds ago Up 5 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
ee0f8e38332e redis:6.0.10-alpine3.12 "docker-entrypoint.s…" 6 seconds ago Up 5 seconds 6379/tcp deployment_persister_1
5bdfdf115d62 docker/compose:1.27.4 "sh /usr/local/bin/d…" 8 seconds ago Up 7 seconds magical_grothendieck
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bfbd06d42779 93fa0c96b2b7 "docker-entrypoint.s…" 8 seconds ago Up 7 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
ee0f8e38332e redis:6.0.10-alpine3.12 "docker-entrypoint.s…" 8 seconds ago Up 7 seconds 6379/tcp deployment_persister_1
5bdfdf115d62 docker/compose:1.27.4 "sh /usr/local/bin/d…" 10 seconds ago Up 9 seconds magical_grothendieck
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bfbd06d42779 93fa0c96b2b7 "docker-entrypoint.s…" 11 seconds ago Up 10 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
ee0f8e38332e redis:6.0.10-alpine3.12 "docker-entrypoint.s…" 11 seconds ago Up 10 seconds 6379/tcp deployment_persister_1
5bdfdf115d62 docker/compose:1.27.4 "sh /usr/local/bin/d…" 13 seconds ago Up 12 seconds magical_grothendieck
[deployment]$ docker logs 5bdfdf11
Creating network "deployment_wres_net" with driver "bridge"
Creating deployment_persister_1 ...
Creating deployment_broker_1 ...
[deployment]$ docker logs 5bdfdf11
Creating network "deployment_wres_net" with driver "bridge"
Creating deployment_persister_1 ...
Creating deployment_broker_1 ...
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bfbd06d42779 93fa0c96b2b7 "docker-entrypoint.s…" 24 seconds ago Up 23 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
ee0f8e38332e redis:6.0.10-alpine3.12 "docker-entrypoint.s…" 24 seconds ago Up 23 seconds 6379/tcp deployment_persister_1
5bdfdf115d62 docker/compose:1.27.4 "sh /usr/local/bin/d…" 26 seconds ago Up 25 seconds magical_grothendieck
[deployment]$ docker logs 5bdfdf11
Creating network "deployment_wres_net" with driver "bridge"
Creating deployment_persister_1 ...
Creating deployment_broker_1 ...
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bfbd06d42779 93fa0c96b2b7 "docker-entrypoint.s…" 29 seconds ago Up 28 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
ee0f8e38332e redis:6.0.10-alpine3.12 "docker-entrypoint.s…" 29 seconds ago Up 28 seconds 6379/tcp deployment_persister_1
5bdfdf115d62 docker/compose:1.27.4 "sh /usr/local/bin/d…" 31 seconds ago Up 30 seconds magical_grothendieck
The tasker and workers came up after it was truly up and good. Now looks like this:
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a78f3e63fe83 nwcal-registry.[host]/wres/wres-worker:20210119-74698fa "./docker-entrypoint…" 2 minutes ago Up 2 minutes deployment_worker_1
ea39497e3a44 nwcal-registry.[host]/wres/wres-worker:20210119-74698fa "./docker-entrypoint…" 2 minutes ago Up 2 minutes deployment_worker_2
2f35fe726dda nwcal-registry.[host]/wres/wres-worker:20210119-74698fa "./docker-entrypoint…" 2 minutes ago Up 2 minutes deployment_worker_4
54e619385fc4 nwcal-registry.[host]/wres/wres-worker:20210119-74698fa "./docker-entrypoint…" 2 minutes ago Up 2 minutes deployment_worker_3
060c3a3faa23 nwcal-registry.[host]/wres/wres-tasker:20210114-7f78a93 "bin/wres-tasker" 2 minutes ago Up 2 minutes 0.0.0.0:443->8443/tcp deployment_tasker_1
bfbd06d42779 93fa0c96b2b7 "docker-entrypoint.s…" 3 minutes ago Up 3 minutes (healthy) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
ee0f8e38332e redis:6.0.10-alpine3.12 "docker-entrypoint.s…" 3 minutes ago Up 3 minutes 6379/tcp deployment_persister_1
5bdfdf115d62 docker/compose:1.27.4 "sh /usr/local/bin/d…" 3 minutes ago Up 3 minutes magical_grothendieck
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-25T23:18:20Z
In order to add the health check to the redis image, we need to separately build a redis image that wraps the existing one and adds the health check. We should probably have it run as @wres_docker@ as well, instead of whatever it currently runs as.
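A wrapper image along those lines might look like this sketch; the user name, timings, and check command are assumptions, not from the repo:

```dockerfile
FROM redis:6.0.10-alpine3.12

# Run as a dedicated account instead of the image default
# (user/group names assumed).
RUN addgroup -S wres_docker && adduser -S -G wres_docker wres_docker
USER wres_docker

# Report healthy only once redis answers PONG, i.e. after the AOF has
# been read; during loading, PING returns a LOADING error instead.
HEALTHCHECK --interval=5s --timeout=5s --retries=3 --start-period=15m \
    CMD redis-cli ping | grep -q PONG || exit 1
```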
Original Redmine Comment Author Name: James (James) Original Date: 2021-01-25T23:54:58Z
On reflection, I wonder if it isn't better to avoid declaring these dependencies via docker-compose and any associated health conditionality. Each dependency should have its own built-in resilience that works regardless of the environment in which it is running, and not only during service start-up. I believe this declaration is gone in the v3 docker-compose format anyway, so it won't port.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-26T15:03:37Z
I'll discuss why I disagree after I do the work.
Original Redmine Comment Author Name: Chris (Chris) Original Date: 2021-01-26T15:19:06Z
With the docker-compose spec version 2.1+, you can combine the @depends_on@ tag with the @healthcheck@ tag in your docker-compose.yml file to only launch service b after service a has started and has been declared healthy without having to write a wrapper image. This is usually the way to go with 3rd party images for finished products (redis, postgres, etc) since they have people focused on making sure that the image on its own is solid. It is then up to the docker-compose configuration to tie everything together, which is part of what it was made for. This somewhat limits the points of failure to misconfiguration and not in the image makeup. It's much easier to fix things like security gaps through smart deployment configuration rather than through static artifact generation.
This comment isn't intended to guide, just to highlight features available to us.
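In compose file terms, the combination looks roughly like this (2.3 format; the service names, image tags, and check details are illustrative, not from the repo):

```yaml
version: '2.3'
services:
  broker:
    image: "rabbitmq:3-management"
    healthcheck:
      # Probe the broker itself; healthy only when it answers.
      test: ["CMD", "rabbitmq-diagnostics", "-q", "ping"]
      interval: 20s
      timeout: 10s
      retries: 3
  tasker:
    image: "example/tasker"
    depends_on:
      broker:
        # Wait for a passing health check, not merely a started container.
        condition: service_healthy
```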
Original Redmine Comment Author Name: Chris (Chris) Original Date: 2021-01-26T15:29:24Z
Something I just caught - version 3+ of compose stripped conditional @depends_on@ clauses. An alternate approach there is a start-of-container script that checks dependent containers' health before continuing. This approach is commonly used for containers that depend on postgres containers.
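A start-of-container check of that kind can be a small polling loop run ahead of the real entrypoint. A sketch, with the probe command left to the caller and the retry budget purely illustrative:

```shell
#!/bin/sh
# wait_for CMD...: retry CMD until it succeeds, or give up after a
# fixed budget (60 attempts, 5s apart - numbers are illustrative).
wait_for() {
    tries=0
    until "$@"; do
        tries=$((tries + 1))
        if [ "$tries" -ge 60 ]; then
            echo "gave up waiting for: $*" >&2
            return 1
        fi
        sleep 5
    done
}

# A v3-era entrypoint might then do (hypothetical host/port/binary):
#   wait_for nc -z broker 5671 && exec bin/wres-tasker
```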
Original Redmine Comment Author Name: James (James) Original Date: 2021-01-26T15:47:14Z
Right, gone from 3+ compose file version. Might want to ask the question: why? (I don't know the answer, but we can search for their thinking... edit: but it's probably related to swarm and what resilience means in that new context).
But my version of the answer is that it's better to build resilience into the contained applications.
Startup covers only a fraction of failure modes. Another failure mode is that a container or containers or the contained applications within them die at runtime, in any number or sequence. If application B depends on application A and a startup health check enforces that B starts when A is healthy, but B is not otherwise resilient, then what happens when A dies at runtime? Or what happens when both A and B die at runtime and B restarts before A (does the docker-compose start-up sequence apply to subsets of containers at all times, including runtime, or only to all containers on start-up?). edit: or what happens when we deploy outside of docker?
Overall, the smell that I am trying to communicate is that these settings are as likely to obfuscate as assist - it is better for the applications to be resilient. The only resilience that containers should have is that they should restart upon failure. But, regarding the contained applications, if the worker starts before the eventsbroker or the graphics, that should be fine. If the eventsbroker starts after the worker and before the graphics and takes N minutes to become healthy, that should be fine. Etc. Regardless of docker.
Original Redmine Comment Author Name: Chris (Chris) Original Date: 2021-01-26T16:18:51Z
I'm too tired to string together the sentences that express how much I wholeheartedly agree, so here's a "link":https://vlab.noaa.gov/redmine/projects/owp-wres-gui/repository/revisions/master/entry/entrypoint.sh to the GUI entrypoint script that seems to do what you're referencing. That doesn't poll and pause the app server when postgres is considered unhealthy, though.
> Might want to ask the question: why? (I don't know the answer, but we can search for their thinking... edit: but it's probably related to swarm and what resilience means in that new context).
Docker has been trying to switch people away from compose and into swarm, even for single node swarms. Swarm naturally waits to direct traffic based on dependent health, so it was rendered moot. You can run into situations where you have 15 replicas with only 8 being healthy so it still has to manage that behavior. Health checks are really important to swarm (almost a requirement, really), so I'm not surprised that they had to play with it.
Original Redmine Comment Author Name: James (James) Original Date: 2021-01-26T16:28:26Z
Right - stuff like that. Build it into the contained applications, I say.
I am not against using additional docker tools if it smooths the way to something in docker or overcomes some bug (to be fixed) whereby the contained applications must start in a particular order, but I think that is a bug nonetheless - for any architecture composed of several microservices, you need to build in resilience out of the box before worrying about docker.
Makes sense re: swarm - I know nothing about it, but it makes sense that you don't send work to containers that are unhealthy (which is a separate issue from building order dependencies into restarts via docker - to be clear, health checks make perfect sense and we should have those).
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-26T16:45:17Z
The docs seem to indicate that there is an indefinite start period by default; however, when the container takes more than about a minute to start up, the default 3 health checks with the default 20-second retries report "unhealthy", causing the startup sequence to stop. Trying the 2.3 format to see if explicitly specifying "0" (supposedly the default, meaning indefinite, I thought) works.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-26T16:49:45Z
Confirmed: with start_period set to 0, interval to 5s, timeout to 7s, and retries to 2, the unhealthy state comes more quickly in this case. So apparently the startup time needs to be covered by the interval and retries as well.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-26T17:28:08Z
Trying a short interval and small retries with a non-zero @start_period@ to see if that properly separates startup from healthy status at runtime. Yes, that appears to work as expected.
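So the working shape pairs a generous start_period with tight runtime checks; roughly (values illustrative, matching the spirit of the diff below):

```yaml
healthcheck:
  # Startup: failing probes during the first 15 minutes do not count.
  start_period: 15m
  # Runtime: after startup, a few quick consecutive failures mark the
  # container unhealthy.
  interval: 5s
  timeout: 7s
  retries: 3
```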
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-26T17:40:25Z
Kubernetes appears to nicely distinguish the aspects into startup, liveness, readiness: https://medium.com/avmconsulting-blog/how-to-perform-health-checks-in-kubernetes-k8s-a4e5300b1f9d
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-26T17:50:57Z
Diff of seemingly working compose:
[deployment]$ diff docker-compose-all-roles-20210119-8b881fe.yml docker-compose-all-roles-20210119-8b881fe_with_healthchecks.yml
1c1
< version: '2.2'
---
> version: '2.3'
13c13
< image: "redis:6.0.10-alpine3.12"
---
> image: "9329653d0aaa"
17c17
< - /mnt/wres_share/job_data:/data
---
> - /mnt/wres_share/job_data2:/data
23a24,26
> # Allow 15+ minutes for startup before failing
> healthcheck:
> start_period: 15m
30,31c33,36
< - "broker"
< - "persister"
---
> broker:
> condition: service_healthy
> persister:
> condition: service_healthy
52c57
< image: "${DOCKER_REGISTRY}/wres/wres-broker:20210114-7f78a93"
---
> image: "93fa0c96b2b7"
64a70,72
> # Allow 5 minutes for startup before failing
> healthcheck:
> start_period: 5m
69c77,78
< - "broker"
---
> broker:
> condition: service_healthy
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-26T21:45:51Z
The critical ones for 5.5 are ready (meaning ones that affect functionality of the software in certain cases), pending NCEP coming back online so I can commit them. Maybe the /home directory thing is better suited to #82962: use docker NFS volume or NFS from within the container. That should solve the "cannot access home after reboot" issue for the most part.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-27T21:18:45Z
commit:c8f3b13be164794f00e9c74646b59cf91378f9a1 has the added healthchecks for redis and broker. Does not have changes for home or eventsbroker. Does not yet include the wres-redis image in the @scripts/dockerize.sh@ image build tool.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-27T22:11:18Z
commit:a7eae95cb165fa1490ae1b0a8fa62961ed0e07d6 has updated @scripts/dockerize.sh@ with the redis image.
At least these two commits should make it into 5.5 but I don't know about the remainder.
Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-01-27T23:43:44Z
Launching now waits for the redis instance to read its aof.
[deployment]$ sudo docker run -d -v /var/run/docker.sock:/var/run/docker.sock -v "$PWD:$PWD" -w "$PWD" --cap-drop ALL --cpus 2 --memory 512M docker/compose:1.27.4 --file docker-compose-all-roles-20210127-be74051.yml up --scale worker=2
8c4f6d66197ff29903ceafa34c3c0ece3bc67176593406196a3a3f3428cc0572
[deployment]$ docker logs 8c4f6d6619
Creating deployment_persister_1 ...
Creating deployment_broker_1 ...
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ad3f4ccc9184 nwcal-registry.[host]/wres/wres-broker:20210127-d64a19d "docker-entrypoint.s…" 6 seconds ago Up 5 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
44abf2d9f6e8 nwcal-registry.[host]/wres/wres-redis:20210127-d64a19d "docker-entrypoint.s…" 6 seconds ago Up 5 seconds (health: starting) 6379/tcp deployment_persister_1
8c4f6d66197f docker/compose:1.27.4 "sh /usr/local/bin/d…" 7 seconds ago Up 7 seconds youthful_keller
[deployment]$ docker logs 8c4f6d6619
Creating deployment_persister_1 ...
Creating deployment_broker_1 ...
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ad3f4ccc9184 nwcal-registry.[host]/wres/wres-broker:20210127-d64a19d "docker-entrypoint.s…" 10 seconds ago Up 8 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
44abf2d9f6e8 nwcal-registry.[host]/wres/wres-redis:20210127-d64a19d "docker-entrypoint.s…" 10 seconds ago Up 8 seconds (health: starting) 6379/tcp deployment_persister_1
8c4f6d66197f docker/compose:1.27.4 "sh /usr/local/bin/d…" 11 seconds ago Up 10 seconds youthful_keller
[deployment]$ docker logs deployment_persister_1
1:C 27 Jan 2021 23:42:00.424 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 27 Jan 2021 23:42:00.424 # Redis version=6.0.10, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 27 Jan 2021 23:42:00.424 # Configuration loaded
1:M 27 Jan 2021 23:42:00.428 * Running mode=standalone, port=6379.
1:M 27 Jan 2021 23:42:00.428 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 27 Jan 2021 23:42:00.428 # Server initialized
1:M 27 Jan 2021 23:42:00.428 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:M 27 Jan 2021 23:42:00.428 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled (set to 'madvise' or 'never').
1:M 27 Jan 2021 23:42:00.432 * Reading RDB preamble from AOF file...
1:M 27 Jan 2021 23:42:00.432 * Loading RDB produced by version 6.0.8
1:M 27 Jan 2021 23:42:00.432 * RDB age 6731941 seconds
1:M 27 Jan 2021 23:42:00.432 * RDB memory usage when created 659.14 Mb
1:M 27 Jan 2021 23:42:00.432 * RDB has an AOF tail
1:M 27 Jan 2021 23:42:02.624 * Reading the remaining AOF tail...
[deployment]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ad3f4ccc9184 nwcal-registry.[host]/wres/wres-broker:20210127-d64a19d "docker-entrypoint.s…" 23 seconds ago Up 21 seconds (health: starting) 4369/tcp, 0.0.0.0:5671->5671/tcp, 5672/tcp, 15672/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15671->15671/tcp deployment_broker_1
44abf2d9f6e8 nwcal-registry.[host]/wres/wres-redis:20210127-d64a19d "docker-entrypoint.s…" 23 seconds ago Up 21 seconds (health: starting) 6379/tcp deployment_persister_1
8c4f6d66197f docker/compose:1.27.4 "sh /usr/local/bin/d…" 24 seconds ago Up 23 seconds youthful_keller
Author Name: Jesse (Jesse) Original Redmine Issue: 85103, https://vlab.noaa.gov/redmine/issues/85103 Original Date: 2020-11-18
Docker health checks to ensure what you see in the below checklist
Redmine related issue(s): 85546, 85803