bsdpot / pot

pot: another container framework for FreeBSD, based on jails, ZFS and pf
BSD 3-Clause "New" or "Revised" License

[BUG] Pot sometimes leaves mount-ins behind #272

Closed: grembo closed this issue 5 months ago

grembo commented 1 year ago

Describe the bug: When using pot with nomad, the mount-ins of nomad's special directories (local, secrets) are left behind.

To Reproduce: Run a basic nomad pot example (like nginx) and migrate it a couple of times (start/stop etc.).
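For illustration, a reproduction loop could look roughly like this; the job file name and job name are placeholders, not taken from the original report:

# hypothetical reproduction loop; nginx.nomad and nginx-pot are placeholder names
nomad job run nginx.nomad
for i in 1 2 3 4 5; do
    nomad job stop -purge nginx-pot
    nomad job run nginx.nomad
done
# any stale mount-ins show up as nullfs mounts below /opt/pot/jails/<pot>/m/
mount -p | awk '$3 == "nullfs" && $2 ~ "^/opt/pot/jails/"'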

After a while you will see something like this, even though only one container is running:

/var/tmp/nomad/alloc/3682d419-1017-3be9-5bae-11e4307ffee9/www1/local      26740536       816   26739720     0%    /opt/pot/jails/www1_1b46f3a2_3682d419-1017-3be9-5bae-11e4307ffee9/m/local
/var/tmp/nomad/alloc/3682d419-1017-3be9-5bae-11e4307ffee9/www1/secrets    26740536       816   26739720     0%    /opt/pot/jails/www1_1b46f3a2_3682d419-1017-3be9-5bae-11e4307ffee9/m/secrets
/var/tmp/nomad/alloc/b1f2fdba-21e7-f932-9f4b-b669fd81acde/www1/local      26740536       816   26739720     0%    /opt/pot/jails/www1_cd56d63b_b1f2fdba-21e7-f932-9f4b-b669fd81acde/m/local
/var/tmp/nomad/alloc/b1f2fdba-21e7-f932-9f4b-b669fd81acde/www1/secrets    26740536       816   26739720     0%    /opt/pot/jails/www1_cd56d63b_b1f2fdba-21e7-f932-9f4b-b669fd81acde/m/secrets
/var/tmp/nomad/alloc/f364feb0-c8af-b232-52b2-2bb3df073c20/www1/local      26740536       816   26739720     0%    /opt/pot/jails/www1_9eb3303e_f364feb0-c8af-b232-52b2-2bb3df073c20/m/local
/var/tmp/nomad/alloc/f364feb0-c8af-b232-52b2-2bb3df073c20/www1/secrets    26740536       816   26739720     0%    /opt/pot/jails/www1_9eb3303e_f364feb0-c8af-b232-52b2-2bb3df073c20/m/secrets
/var/tmp/nomad/alloc/8dd36275-c514-be0a-4925-efdd1447e5ec/www1/local      26740536       816   26739720     0%    /opt/pot/jails/www1_3a708d45_8dd36275-c514-be0a-4925-efdd1447e5ec/m/local
/var/tmp/nomad/alloc/8dd36275-c514-be0a-4925-efdd1447e5ec/www1/secrets    26740536       816   26739720     0%    /opt/pot/jails/www1_3a708d45_8dd36275-c514-be0a-4925-efdd1447e5ec/m/secrets
/var/tmp/nomad/alloc/04c43aae-b5b9-1fe1-b2d4-9bcbd39b6997/www1/local      26740536       816   26739720     0%    /opt/pot/jails/www1_2db4780c_04c43aae-b5b9-1fe1-b2d4-9bcbd39b6997/m/local
/var/tmp/nomad/alloc/04c43aae-b5b9-1fe1-b2d4-9bcbd39b6997/www1/secrets    26740536       816   26739720     0%    /opt/pot/jails/www1_2db4780c_04c43aae-b5b9-1fe1-b2d4-9bcbd39b6997/m/secrets
/var/tmp/nomad/alloc/32f9430c-5392-ebf2-609a-60b57ba3eae5/www1/local      26740536       816   26739720     0%    /opt/pot/jails/www1_4e4600ed_32f9430c-5392-ebf2-609a-60b57ba3eae5/m/local
/var/tmp/nomad/alloc/32f9430c-5392-ebf2-609a-60b57ba3eae5/www1/secrets    26740536       816   26739720     0%    /opt/pot/jails/www1_4e4600ed_32f9430c-5392-ebf2-609a-60b57ba3eae5/m/secrets

Expected behavior: No leftover mounts.

Additional context: My suspicion is that the umounts fail when the jail stops (maybe because some process is still using the mount point), while the ZFS file system is only purged later. Manually umounting these leftover mounts works fine.
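Cleaning one of them up by hand looks roughly like this (a sketch, run as root; the jail path below is copied from the df output above):

# manual cleanup of one stale pot's mount-ins (sketch); the jail path is from the df output above
stale=/opt/pot/jails/www1_1b46f3a2_3682d419-1017-3be9-5bae-11e4307ffee9/m
mount -p | awk -v p="$stale" '$3 == "nullfs" && index($2, p) == 1 { print $2 }' | \
while read -r mnt; do
    umount "$mnt"
done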

grembo commented 10 months ago

It seems like this depends on the order in which nomad-pot-driver issues certain commands:

Example of a command sequence that left mounts behind (syslog excerpt, newest entries first):

2023-12-21T11:38:25+00:00 10.20.20.231 pot[42497]: pot-destroy -p myservice_fdf1f644_ad0150ca-d40b-f752-b564-8fe4d86c657e myservice -F 
2023-12-21T11:38:25+00:00 10.20.20.231 pot[42476]: pot-set-status -p myservice_fdf1f644_ad0150ca-d40b-f752-b564-8fe4d86c657e -s stopped
2023-12-21T11:38:24+00:00 10.20.20.231 pot[40184]: pot-destroy -p myservice_fdf1f644_ad0150ca-d40b-f752-b564-8fe4d86c657e -F
2023-12-21T11:38:24+00:00 10.20.20.231 pot[40017]: pot-set-status -p myservice_fdf1f644_ad0150ca-d40b-f752-b564-8fe4d86c657e -s stopping
2023-12-21T11:38:18+00:00 10.20.20.231 pot[39032]: pot-set-status -p myservice_fdf1f644_ad0150ca-d40b-f752-b564-8fe4d86c657e -s stopping
2023-12-21T11:38:18+00:00 10.20.20.231 pot[38992]: pot-stop myservice_fdf1f644_ad0150ca-d40b-f752-b564-8fe4d86c657e myservice

Two things are of interest here:

- the pot is stopped and destroyed twice (note the different PIDs);
- the second "stopping" call comes a few seconds after the first (11:38:18 vs. 11:38:24), which looks like a nomad timeout kicking in.

So the fix for this might belong in nomad, but it also looks like there is a lack of locking involved: stop and destroy can be called multiple times in parallel for the same pot.
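To illustrate the locking point (this is not how pot or nomad-pot-driver is implemented, just a sketch): concurrent stop/destroy calls for the same pot could be serialized with lockf(1), for example via a small wrapper:

#!/bin/sh
# hypothetical wrapper serializing "pot stop"/"pot destroy" per pot name via lockf(1);
# neither pot nor nomad-pot-driver ships such a script
cmd="$1"    # stop or destroy
name="$2"   # pot name
lock="/tmp/pot-${name}.lock"
case "$cmd" in
stop)    exec lockf -k -t 30 "$lock" pot stop "$name" ;;
destroy) exec lockf -k -t 30 "$lock" pot destroy -p "$name" -F ;;
*)       echo "usage: $0 stop|destroy potname" >&2; exit 1 ;;
esac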

grembo commented 5 months ago

It seems like this was caused by the Prometheus node_exporter running on the host, which fits the earlier suspicion that something was still using the mount points when the jail stopped. Excluding pot's file systems solved the issue.

For reference, the node_exporter configuration used to exclude pot's file systems and network devices:

service node_exporter enable
sysrc node_exporter_user=nodeexport
sysrc node_exporter_group=nodeexport
sysrc node_exporter_listen_address="127.0.0.1:9100"
echo '--log.level=warn
--collector.filesystem.mount-points-exclude=^/(dev|opt)($|/)
--collector.filesystem.fs-types-exclude=^(devfs|nullfs)$
--collector.netdev.device-exclude=^(p4|epair)' \
  >/usr/local/etc/node_exporter_args
sysrc node_exporter_args="@/usr/local/etc/node_exporter_args"
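A quick (hypothetical) check that the exclusion is effective, assuming the listen address configured above:

# restart the exporter and count pot mount points in the exported metrics
service node_exporter restart
fetch -qo - http://127.0.0.1:9100/metrics | grep -c 'mountpoint="/opt/pot'
# a count of 0 means pot's file systems are no longer reported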