EugenKon closed this issue 2 months ago
Hi @EugenKon and thanks for raising this issue. Is the issue that the included template is not being rendered as you expect (it's a little hard to understand)? If so, can you provide any logs to help look at what the client does as well as a reproduction job specification? The consul-template service function has a large number of options which may be needed depending on your setup.
The screenshots show the reload script exited with code 0, indicating success and that NGINX has an acceptable configuration. That shows the previous template was successfully rendered before the script was executed.
@jrasell It seems I found the problem. The file on the host system is updated, but it is not visible from the container.
less /data/nomad/client/alloc/d27f63e2-35da-917d-862d-a1850fae97db/wi-nginx-task/custom/walk-inside.com.nginx.conf
docker exec -it wi-nginx-task-d27f63e2-35da-917d-862d-a1850fae97db less /tmp/nginx/sites-enabled/walk-inside.com
I just checked: if I manually change the Nginx configuration file on the host system, the new content is not visible from inside the Nginx container.
The problem is probably the rprivate option used to mount the nginx configuration into the container.
docker --version
Docker version 27.1.1, build 6312585
This looks like the old Docker bug https://github.com/docker/for-win/issues/5530, but that one was fixed. It could be related to Ubuntu 24.04.
Linux ip-172-31-0-179 6.8.0-1012-aws #13-Ubuntu SMP Mon Jul 15 13:40:27 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
@EugenKon looks like you've got it figured out. Just FYI, instead of volumes you can use the mount block to have fine-grained control over things like mount propagation.
@tgross Finally I managed to minimize the experiment and show that this is a Nomad bug which is critical and easy to fix: Just do not recreate the file. It should be rewritten instead.
Below you can see that the rendered template is mounted into the container. The inode is the same for both files. Changes made to the file from inside the container are visible on the host, and changes made from the host are visible inside the container. But when Nomad notices a change, it rerenders the template and creates a new file on the host system, and this new file is not mounted into the container. Notice that the inode on the host system has changed. This results in two different files:
From CONTAINER
6581921960ca:/opt# stat zz
File: zz
Size: 37 Blocks: 8 IO Block: 4096 regular file
Device: 10301h/66305d Inode: 3825587 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2024-08-09 15:50:56.926060691 +0000
Modify: 2024-08-09 15:50:54.300050975 +0000
Change: 2024-08-09 15:50:54.303050986 +0000
6581921960ca:/opt# cat zz
upstream server 172.31.10.241:24826;
6581921960ca:/opt# date; cat zz; date; echo "INSIDE" >> zz; cat zz; date;
Fri Aug 9 15:58:10 UTC 2024
upstream server 172.31.10.241:24826;
HOST
Fri Aug 9 15:58:10 UTC 2024
upstream server 172.31.10.241:24826;
HOST
INSIDE
Fri Aug 9 15:58:10 UTC 2024
6581921960ca:/opt# stat zz
File: zz
Size: 49 Blocks: 8 IO Block: 4096 regular file
Device: 10301h/66305d Inode: 3825587 Links: 0
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2024-08-09 15:58:12.065860201 +0000
Modify: 2024-08-09 15:58:10.296851206 +0000
Change: 2024-08-09 15:58:13.464867313 +0000
6581921960ca:/opt# cat zz
upstream server 172.31.10.241:24826;
HOST
INSIDE
From HOST
root@ip-172-31-10-241:/data/nomad/client/alloc/b80c8922-4e84-32fb-68ab-fbf6c56c31ff/test/custom# stat zz
File: zz
Size: 37 Blocks: 8 IO Block: 4096 regular file
Device: 259,1 Inode: 3825587 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2024-08-09 15:50:56.926060691 +0000
Modify: 2024-08-09 15:50:54.300050975 +0000
Change: 2024-08-09 15:50:54.303050986 +0000
Birth: 2024-08-09 15:50:54.300050975 +0000
root@ip-172-31-10-241:/data/nomad/client/alloc/b80c8922-4e84-32fb-68ab-fbf6c56c31ff/test/custom# cat zz
upstream server 172.31.10.241:24826;
root@ip-172-31-10-241:/data/nomad/client/alloc/b80c8922-4e84-32fb-68ab-fbf6c56c31ff/test/custom# date; echo "HOST" >> zz; cat zz; date;
Fri Aug 9 15:58:09 UTC 2024
upstream server 172.31.10.241:24826;
HOST
Fri Aug 9 15:58:09 UTC 2024
root@ip-172-31-10-241:/data/nomad/client/alloc/b80c8922-4e84-32fb-68ab-fbf6c56c31ff/test/custom# date; cat zz; date;
Fri Aug 9 15:58:12 UTC 2024
upstream server 172.31.10.241:24826;
HOST
INSIDE
Fri Aug 9 15:58:12 UTC 2024
root@ip-172-31-10-241:/data/nomad/client/alloc/b80c8922-4e84-32fb-68ab-fbf6c56c31ff/test/custom# stat zz
File: zz
Size: 37 Blocks: 8 IO Block: 4096 regular file
Device: 259,1 Inode: 3825588 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2024-08-09 15:58:23.468928499 +0000
Modify: 2024-08-09 15:58:13.461867298 +0000
Change: 2024-08-09 15:58:13.464867313 +0000
Birth: 2024-08-09 15:58:13.461867298 +0000
root@ip-172-31-10-241:/data/nomad/client/alloc/b80c8922-4e84-32fb-68ab-fbf6c56c31ff/test/custom# cat zz
upstream server 172.31.10.241:24826;
As a temporary workaround, do not mount the rendered template file itself into the container. Instead, mount a folder into the container and render the template into that folder.
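The difference between mounting a file and mounting a folder can be illustrated without Docker or Nomad. The sketch below is only an analogy: an already-open file handle stands in for a file-level bind mount (both stay attached to the original inode), and a fresh lookup by path stands in for a directory mount. The helper name replace_atomically and the file contents are made up for illustration.

```python
import os
import tempfile


def replace_atomically(path, text):
    """Write text to a temp file in the same directory, then rename it over path."""
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path))
    with os.fdopen(fd, "w") as tmp:
        tmp.write(text)
    os.replace(tmp_path, path)  # the name now points at a new inode


def demo():
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "zz")
        with open(path, "w") as f:
            f.write("upstream old;\n")

        # Analogy for a file bind mount: a handle pinned to the original inode.
        pinned = open(path)

        # Analogy for a template re-render that recreates the file.
        replace_atomically(path, "upstream new;\n")

        pinned.seek(0)
        seen_by_pinned_handle = pinned.read()
        pinned.close()

        # Analogy for a directory mount: the name is looked up again.
        with open(path) as f:
            seen_by_fresh_lookup = f.read()

    return seen_by_pinned_handle, seen_by_fresh_lookup


if __name__ == "__main__":
    stale, fresh = demo()
    print("pinned handle sees:", stale.strip())
    print("fresh lookup sees: ", fresh.strip())
```

The pinned handle keeps returning the old content, much like the container in the transcript above, while a lookup through the directory finds the new file.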
Finally I managed to minimize the experiment and show that this is a Nomad bug which is critical and easy to fix: Just do not recreate the file. It should be rewritten instead.
The file write is happening via the embedded consul-template. What you're seeing is not a bug but intentional, because it's the only way to ensure that a write is atomic. See https://github.com/hashicorp/consul-template/issues/1410 for another example of someone reporting this non-bug behavior.
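To see why an atomic write goes hand in hand with a new inode, compare an in-place rewrite with the common write-then-rename pattern. This is a minimal Python sketch of the general technique, not consul-template's actual code; the function names are hypothetical and chosen for illustration only.

```python
import os
import tempfile


def write_in_place(path, text):
    """Truncate and rewrite the existing file. The inode is kept, but a
    concurrent reader may observe an empty or partially written file."""
    with open(path, "w") as f:
        f.write(text)


def write_atomically(path, text):
    """Write a sibling temp file, then rename it over path. Readers see either
    the complete old file or the complete new one, but the inode changes."""
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path))
    with os.fdopen(fd, "w") as tmp:
        tmp.write(text)
        tmp.flush()
        os.fsync(tmp.fileno())  # make sure the data is on disk before renaming
    os.replace(tmp_path, path)


def inode_history():
    """Return the inode of the file after each kind of write."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "zz")
        write_in_place(path, "v1\n")
        first = os.stat(path).st_ino
        write_in_place(path, "v2\n")
        after_in_place = os.stat(path).st_ino
        write_atomically(path, "v3\n")
        after_rename = os.stat(path).st_ino
    return first, after_in_place, after_rename


if __name__ == "__main__":
    print(inode_history())
```

The second value matches the first, while the third differs, which mirrors the inode change seen in the host-side stat output earlier in the thread.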
Your workaround described in https://github.com/hashicorp/nomad/issues/23691#issuecomment-2278753245 is almost certainly how you should be handling this kind of issue.
Nomad version
Operating system and Environment details
Issue
Reproduction steps
Expected Result
The generated configuration should be correct, e.g. it should contain the configured upstream.
Actual Result
The generated configuration does not have the expected upstream, but the service is up and healthy. The suspicious part is that I do not see that the template was rerendered. Earlier, when it was change_mode="signal", I believe I saw messages like: template was rerendered.
Job file (if appropriate)
Temporary workaround
docker stop wi-nginx-task-xxx