synpse-hq / synpse

Synpse is an all-in-one solution to manage your servers and IoT devices providing declarative app deployment, SSH access and TCP tunnels
https://synpse.com
Apache License 2.0

AGENT_IMAGE_GC_AGE doesn't remove old docker images #31

Closed (hrfuller closed 1 year ago)

hrfuller commented 1 year ago

We are having disk space issues on our hosts because old Docker images are not getting purged. According to a thread on Discord, the default GC age is 48h, but we are seeing images that are over a week old on some hosts. I recently changed AGENT_IMAGE_GC_AGE to 6h but see no change in the age of the older Docker images on the host.

Here is the system info for our hosts:

$ docker version
Client:
 Version:           20.10.12
 API version:       1.41
 Go version:        go1.16.2
 Git commit:        20.10.12-0ubuntu2~20.04.1
 Built:             Wed Apr  6 02:16:12 2022
 OS/Arch:           linux/arm64
 Context:           default
 Experimental:      true

Server:
 Engine:
  Version:          20.10.12
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.2
  Git commit:       20.10.12-0ubuntu2~20.04.1
  Built:            Thu Feb 10 15:03:35 2022
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.5.9-0ubuntu1~20.04.4
  GitCommit:
 nvidia:
  Version:          1.1.0-0ubuntu1~20.04.1
  GitCommit:        629a689
 docker-init:
  Version:          0.19.0
  GitCommit:
$ uname -a
Linux dev1 5.10.104-tegra #1 SMP PREEMPT Thu Sep 8 16:22:59 CST 2022 aarch64 aarch64 aarch64 GNU/Linux
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.4 LTS
Release:    20.04
Codename:   focal

Here is the updated synpse-agent.service:

[Unit]
Description=Synpse agent
Documentation=https://synpse.net/docs/
Wants=network-online.target

[Install]
WantedBy=multi-user.target

[Service]
KillMode=process
Delegate=yes
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
ExecStart=/usr/local/bin/synpse-agent run
Environment=AGENT_IMAGE_GC_AGE="6h"

mjudeikis commented 1 year ago

I'm just checking this. By any chance, had you used the same image outside Synpse on the same device before? Currently, we do not force deletion if the image is in use. If containers created from it are still present (stopped or failed), the image will not be purged once the 6h have passed.

For example:

docker ps -a | grep <image-name>

Wondering if this might be the case.

Still looking, but I want to make sure this is not the case.
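
A somewhat more precise check than grepping is Docker's ancestor filter, which lists every container (running, stopped, or failed) created from a given image. A minimal sketch; the helper name containers_using_image is made up for illustration and is not part of Synpse or Docker:

```shell
# List all containers, including stopped or failed ones, created from an image.
# containers_using_image is a hypothetical helper name chosen for this sketch.
containers_using_image() {
    image=$1
    docker ps -a --filter "ancestor=$image" --format '{{.ID}} {{.Image}} {{.Status}}'
}

# Usage: containers_using_image <image-name>
```

If this prints nothing, no container on the host references the image, so leftover containers are unlikely to be what blocks the purge.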

hrfuller commented 1 year ago

No, we don't start the images in question outside of synpse. I manually purged the old images to free up space so that would have deleted any stopped containers. However I'm very confident that we aren't running the images in question outside of synpse.

hrfuller commented 1 year ago

Hey @mjudeikis, I was wondering if you had a chance to dig into this. I realize a repro may be difficult, but it would be good to know at least whether it's a known issue in Synpse or just something in our setup.

rusenask commented 1 year ago

Hi, I will check this out too. Do the image tags change? Or is it the same tag being updated?

mjudeikis commented 1 year ago

We added an additional "force" purge option to the agent:

    AGENT_IMAGE_GC_FORCE=true
    AGENT_IMAGE_GC_AGE=24h

We noticed that Docker sometimes prevents GC because the "image is being used", and it is quite inconvenient to identify why in each case. In this mode the agent will purge images not used by Synpse with the "force" flag.

Could you try this and let us know if it helps?
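
For hosts that run the agent under systemd, as in the synpse-agent.service unit posted earlier in this thread, one way to set both variables is a drop-in file. A sketch under that assumption; the helper name write_gc_dropin and the file name gc.conf are made up for illustration:

```shell
# Write a systemd drop-in that sets the two GC variables for the agent.
# write_gc_dropin and gc.conf are hypothetical names chosen for this sketch.
# The default directory assumes the unit is named synpse-agent.service.
write_gc_dropin() {
    dir=${1:-/etc/systemd/system/synpse-agent.service.d}
    mkdir -p "$dir" || return 1
    cat > "$dir/gc.conf" <<'EOF'
[Service]
Environment=AGENT_IMAGE_GC_FORCE=true
Environment=AGENT_IMAGE_GC_AGE=24h
EOF
}

# Typical use, as root:
#   write_gc_dropin && systemctl daemon-reload && systemctl restart synpse-agent
```

Afterwards, systemctl show synpse-agent --property=Environment should list both variables if the drop-in was picked up.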

hrfuller commented 1 year ago

> Hi, I will check this out too. Do the image tags change? Or is it the same tag being updated?

The images all have a unique tag.

hrfuller commented 1 year ago

> Could you try this and let us know if it helps?

I tried this and it hasn't seemed to do anything. However I noticed that the synpse-agent version running on our hosts seems old.

Synpse agent version: "0.21.14" build time: 2023-02-13T141031Z commit: 29562c744

We may not have the up-to-date agent changes that support this new environment variable. Should Synpse update itself automatically, or do I need to reinstall the agent at a newer version?

mjudeikis commented 1 year ago

You should be able to trigger an agent update via the agent view in the UI:

[Two screenshots of the agent view in the UI, taken 2023-06-05]

Try bumping to 0.21.19, as the force option was released in that version.

hrfuller commented 1 year ago

Just stumbled on this. Thanks.

hrfuller commented 1 year ago

I made the environment changes for the agent and updated to the latest release, then restarted the agent. I still see an image that is 11 days old, and it isn't the one the synpse application is using. How often will synpse try to clean up old images?

mjudeikis commented 1 year ago

If your image GC age is set to 24h, please wait 24h and check again. The process indexes the images and removes those "not used in the last set window"; it does not go by image age, as image age in Docker is not something you can rely on. If possible, could you avoid restarting the agent for 24h?

If this is still an issue after that, I think we will try one more change on the backend to add different purge logic.
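
To make the distinction concrete, here is a rough sketch of window-based purging as described above, where what counts is the time since the image was last seen in use rather than the image's creation date. This is an illustration only, not the agent's actual code; the function name should_purge and its arguments are assumptions:

```shell
# Decide whether an image is eligible for purging.
# Hypothetical helper: should_purge LAST_USED NOW WINDOW
#   LAST_USED  epoch seconds when the image was last seen in use
#   NOW        current epoch seconds
#   WINDOW     GC window in seconds (24h = 86400)
# Returns 0 (true) only when the image has been unused for the whole window.
should_purge() {
    last_used=$1
    now=$2
    window=$3
    [ $((now - last_used)) -ge "$window" ]
}

# An old image that was last seen in use one hour ago is kept:
#   should_purge "$(( $(date +%s) - 3600 ))" "$(date +%s)" 86400 || echo keep
```

Under this model an 11-day-old image only becomes eligible once a full window has elapsed since it was last indexed as in use. If the index does not survive an agent restart (an assumption), restarting would start the window over, which would fit the request not to restart the agent for 24h.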

mjudeikis commented 1 year ago

If the issue still persists, please run docker inspect on the image that is not being deleted and send the output to hello@synpse.net.

hrfuller commented 1 year ago

I believe that this works now. I will re-open if I see anything to the contrary. Thanks for the help.

mjudeikis commented 1 year ago

Thanks!!! And sorry it took a long time.