sabeechen / hassio-google-drive-backup

Automatically create and sync Home Assistant backups into Google Drive
MIT License

Leftover images from upgrades showing in Portainer #328

Closed rpitera closed 3 years ago

rpitera commented 3 years ago

When I was browsing in Portainer today, I noticed a number of older-version images for GDB showing as unused.

[Image: Clipboard01]

Can these be safely removed? Should they normally have been removed during the upgrade process? It's a considerable amount of space being taken up and I'd like to free it up.

System Health

version 2021.1.1
installation_type Home Assistant OS
dev false
hassio true
docker true
virtualenv false
python_version 3.8.7
os_name Linux
os_version 5.4.86
arch x86_64
timezone America/New_York
Home Assistant Community Store
GitHub API | ok
Github API Calls Remaining | 4332
Installed Version | 1.9.0
Stage | running
Available Repositories | 713
Installed Repositories | 56

AccuWeather
can_reach_server | ok
remaining_requests | 46

Home Assistant Cloud
logged_in | true
subscription_expiration | January 16, 2021, 7:00 PM
relayer_connected | true
remote_enabled | true
remote_connected | true
alexa_enabled | true
google_enabled | true
can_reach_cert_server | ok
can_reach_cloud_auth | ok
can_reach_cloud | ok

Hass.io
host_os | Home Assistant OS 5.10
update_channel | stable
supervisor_version | 2020.12.7
docker_version | 19.03.13
disk_total | 30.8 GB
disk_used | 21.2 GB
healthy | true
supported | true
board | ova
supervisor_api | ok
version_api | ok
installed_addons | Terminal & SSH (8.10.0), ADB - Android Debug Bridge (0.6.3), ESPHome (1.15.3), Visual Studio Code (2.9.1), Let's Encrypt (4.11.0), Node-RED (7.2.11), Home Assistant Google Drive Backup (0.103.0), MariaDB (2.2.1), Check Home Assistant configuration (3.6.0), Glances (0.9.1), Bitwarden RS (0.6.2), Portainer (1.3.0), Tautulli (1.5.1), phpMyAdmin (0.1.4), ozwcp (1.1.2), rest980 Docker Image (20200205), php-nginx Docker Image (latest), Eufy Home Assistant MQTT Bridge (1.1.0)

Lovelace
dashboards | 2
mode | yaml
views | 27
resources | 45

sabeechen commented 3 years ago

It's my understanding that yes, they should be removed. I suspect you can remove them manually, but make sure you have a good snapshot "just in case" before you do, as the supervisor uses Docker in mysterious ways and it might be intentional. It's easy to accidentally mess up the supervisor's delicate environment through Portainer, and it's usually difficult to fix.

I'll need to do a little more digging to figure out why those old images aren't getting cleaned up, or if it's intentional.

rpitera commented 3 years ago

Thanks! Would leaving things intact for the moment help you to figure out what's going on or why it happened? I'm not currently having a space crisis and I'd hate to delete anything if the evidence would help you.

res2cpu commented 3 years ago

Good Afternoon,

I had the same issue, which I noticed after I got a low storage notification. I run a 10 GB disk on Proxmox, so they were using nearly 10% of my storage.

Just to give it a test, I did a full backup and then removed the old images. Everything still runs without problems after removing them.

Thought I would comment as an FYI just in case anyone else finds their way here.

rpitera commented 3 years ago

Thanks @res2cpu! I figured as much, but I'm still holding out until I hear from @sabeechen because I don't want to destroy any evidence that might help him figure out what happened.

sabeechen commented 3 years ago

Go ahead and get rid of them (again take a snapshot beforehand just in case) as I don't think I'll need them to help debugging. The logs that could potentially help are almost certainly months old and long since deleted. I'm going to need to find a way to reproduce this locally to have any hope of solving it.

rpitera commented 3 years ago

OK, thanks - just didn't want to make your job any harder if I could help it. Cheers. If you need anything from me, just reply in this issue and I'll get back to you right away.

sabeechen commented 3 years ago

I can see in the logs of my local supervisor that it complains when I update the addon:

21-01-20 21:32:14 WARNING (SyncWorker_1) [supervisor.docker.interface] Can't find sabeechen/hassio-google-drive-backup-amd64 for cleanup

So something is definitely wrong, and I'm able to reproduce this by creating a dummy addon with the same docker images as the "official" addon. But why does it happen? The image is there. More digging.
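For anyone who wants to check their own supervisor logs for the same symptom, here is a minimal Python sketch that pulls the image name out of warnings shaped like the one quoted above. The regular expressions are inferred from that single log line, and the helper name `failed_cleanups` is hypothetical; other supervisor versions may format their logs differently.

```python
import re

# Pattern inferred from the single log line quoted in this thread:
# "<date> <time> <LEVEL> (<thread>) [<logger>] <message>"
LOG_LINE = re.compile(
    r"^(?P<date>\d{2}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) \((?P<thread>[^)]+)\) \[(?P<logger>[^\]]+)\] "
    r"(?P<message>.*)$"
)
CLEANUP_WARNING = re.compile(r"^Can't find (?P<image>\S+) for cleanup$")


def failed_cleanups(lines):
    """Return image names from "Can't find ... for cleanup" warnings."""
    images = []
    for line in lines:
        parsed = LOG_LINE.match(line.strip())
        if not parsed or parsed["level"] != "WARNING":
            continue
        warning = CLEANUP_WARNING.match(parsed["message"])
        if warning:
            images.append(warning["image"])
    return images


sample = [
    "21-01-20 21:32:14 WARNING (SyncWorker_1) [supervisor.docker.interface] "
    "Can't find sabeechen/hassio-google-drive-backup-amd64 for cleanup",
]
print(failed_cleanups(sample))
```

Note that the extracted name carries no version tag, so the log alone cannot show which tag the cleanup was looking for.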

res2cpu commented 3 years ago

I think it might be trying to clean up something that doesn't exist. From the image in your first post, the images seem to have version numbers as part of the name.

Maybe the clean-up script is not factoring that in.

sabeechen commented 3 years ago

The code where the supervisor does the removal does include the version; it just doesn't get printed in the logs. Unfortunately it also seems to eat all the other error information that might otherwise help debug. I've reached out to the supervisor people to see if they have any recommendations for debugging this further, since this looks like it might be a supervisor bug.
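To make the behavior described here concrete, the following is a simplified Python model, not the supervisor's actual code. It assumes the cleanup first looks up the current image as `repo:version` and, if that lookup fails, logs a warning that names only the repository. The function name `cleanup` and the older version number are hypothetical.

```python
def cleanup(repo, version, present_tags):
    """Model one cleanup pass (a simplified assumption, not supervisor code).

    Returns (tags_to_remove, warning). If the current image cannot be
    found, nothing is removed and the warning omits the version, which
    mirrors the log line quoted earlier in this thread.
    """
    current = f"{repo}:{version}"
    if current not in present_tags:
        return [], f"Can't find {repo} for cleanup"
    stale = [
        tag
        for tag in present_tags
        if tag.startswith(f"{repo}:") and tag != current
    ]
    return stale, None


REPO = "sabeechen/hassio-google-drive-backup-amd64"
tags = [f"{REPO}:0.102.0", f"{REPO}:0.103.0"]  # 0.102.0 is a made-up example

print(cleanup(REPO, "0.103.0", tags))  # version known: old tag is stale
print(cleanup(REPO, None, tags))       # version unknown: warning, nothing removed
```

In this model, a wrong or missing version is enough to make every cleanup pass a no-op while the image itself is still present, which would match the "the image is there" observation.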

res2cpu commented 3 years ago

Interesting. Well in any case if I can be of any help in testing let me know, and also thank you for your work on this plugin. It makes snapshot handling so much easier!

Given better times I would have already sent you a coffee or two.

rpitera commented 3 years ago

Just for the record, it's not only your add-on: I found that more than a few add-ons I tried once and decided weren't for me had left images behind. Clearing these unused images (after CAREFULLY tracking what they actually were and doing a full snapshot prior) has not only reclaimed the space, but HA feels extremely responsive now. Super fast. I still have a couple in there, but since I can't verify exactly what they are, or whether they only appear unused because another add-on uses them only when running, I've left them in place.

But I get the feeling that this is indeed either a supervisor bug or the result of something that wasn't documented/explained correctly in the add on library dev docs.

sabeechen commented 3 years ago

I can see some of that on my dev machine too. It seems to often keep the previous version of an image around (but not always?) and I can see some much older ones too.

sabeechen commented 3 years ago

@frenck tells me it's a problem with how I have the addon's container images configured. I should be setting legacy: true in the addon's config to make the old images get cleaned up. Once I do some verification that flipping on that option won't cause problems, I'll make a new version with it enabled.

It remains to be seen if this will clean up old images that already exist. There might not be an easy way to make the supervisor recover the space users have already lost.

frenck commented 3 years ago

it doesn't clean up old images that already exist, unfortunately. The supervisor has no link with them.

sabeechen commented 3 years ago

That's a bummer, but thanks for letting me know. Anyone who's used this addon for a while is likely to have many GB of old images, and I'm hesitant to recommend that anyone install Portainer and muck about with unused images. At least things will be fixed going forward. Doesn't feel good to make 1000s of new users lose precious SD card space :(

I'm also going to make a PR to include some mention of the impact of the legacy config option in the addon dev docs here, since I thought I was being very careful in how I configured and built my addon but I still managed to overlook this for several years.

res2cpu commented 3 years ago

Could another add-on, like the Home Assistant config checker, be made to look for "unused" containers? The image in the first post seems to show them labeled by Portainer as unused. I don't know how it flags them.
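On the question of how images get flagged: one plausible rule (an assumption here, not something confirmed in this thread) is that an image counts as unused when no container on the host references it. A minimal Python sketch of that rule follows, using hypothetical data; the older version numbers and the container name are made up for illustration.

```python
def unused_images(image_tags, containers):
    """Return the tags in image_tags that no container references.

    image_tags: iterable of "repo:tag" strings present on the host.
    containers: iterable of dicts with an "image" key ("repo:tag").
    """
    in_use = {container["image"] for container in containers}
    return sorted(tag for tag in image_tags if tag not in in_use)


REPO = "sabeechen/hassio-google-drive-backup-amd64"
# Hypothetical host state: three images, one container on the newest one.
images = [f"{REPO}:0.101.0", f"{REPO}:0.102.0", f"{REPO}:0.103.0"]
containers = [{"name": "addon_example", "image": f"{REPO}:0.103.0"}]

print(unused_images(images, containers))
```

A check like this only knows about containers that exist right now. It cannot tell whether some add-on would need an image the next time it starts, which is the caveat raised earlier in the thread about leaving unverified images in place.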

rpitera commented 3 years ago

I noticed this behavior with a number of other addons, even including one of frenck's old unused VSCode images. I think it is a great idea to mention this in the addon dev docs to make sure they are aware of it. Does removing the images in portainer reclaim the space?

frenck commented 3 years ago

@rpitera Yes, it reclaims the free space.

"I think it is a great idea to mention this in the addon dev docs to make sure they are aware of it"

It is written in the dev docs.

rpitera commented 3 years ago

@frenck Thanks for the confirmation, Franck! @sabeechen You can close this whenever you feel it's been addressed; I'm good on my end. Thanks again.

sabeechen commented 3 years ago

I'll leave this open until I've pushed a version that fixes it moving forward (it's just the policy I use for issues that need code changes).

sabeechen commented 3 years ago

I ran into some trouble enabling the 'legacy': true option for the addon. When updating from a version without the tag to one with it, it would install the new addon and leave the old one around unused (which is undesirable but expected). However, when updating to the next version (also with 'legacy': true), it would download the new image and then delete all images for the addon as part of its cleanup, including the latest image it had just downloaded. The addon would then obviously fail to start because the image was deleted. I'm unsure what to make of that behavior.

I've opted to just get the official builder working, which adds the tags the supervisor is expecting. In the manual testing I've done on a dummy addon, not only does this upgrade as expected, but it also seems to clean up the old addon images even though they don't have the appropriate tags. This means users will (hopefully) have all their old unused images cleaned up and the wasted disk space recovered as soon as they upgrade, without any other action.
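The thread does not spell out which tags the supervisor expects. As a rough illustration only, the Python sketch below assumes they are Docker image labels and that one of them is named `io.hass.version`; both the label name and the sample metadata are assumptions, not details confirmed here. Given image metadata as plain dictionaries, it lists the images that lack the label.

```python
VERSION_LABEL = "io.hass.version"  # assumed label name, not confirmed in this thread


def missing_version_label(images):
    """Return the tags of images whose labels lack VERSION_LABEL."""
    return [
        image["tag"]
        for image in images
        if VERSION_LABEL not in (image.get("labels") or {})
    ]


REPO = "sabeechen/hassio-google-drive-backup-amd64"
# Hypothetical metadata: one image built without labels, one built with them.
sample_images = [
    {"tag": f"{REPO}:0.103.0", "labels": None},
    {"tag": f"{REPO}:0.103.1", "labels": {VERSION_LABEL: "0.103.1"}},
]

print(missing_version_label(sample_images))
```

A check like this could be run against the metadata of a dummy addon's images before and after switching builders to confirm that the new build actually carries the label.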

Note: I'm not planning to make changes anywhere to the addon dev docs about the legacy option since after playing with it a little bit I'm not confident I can speak accurately to what it does.

rpitera commented 3 years ago

Thanks for keeping on this, Stephen. Since I have been doing my 'cleanup' work with other 'dead' images from other addons, I've been seeing a real measurable increase in responsiveness. Most images are obvious in their labelling but I'm still trying to track one down that's simply labeled "node:boron". I'm sure it's for some image I uninstalled but I can't figure out what.

For sure, your addon is not the only one exhibiting this behavior and I'm wondering how many users out there have a few unused images taking up space and resources.

sabeechen commented 3 years ago

I just released v0.103.1, which includes this fix. If anyone continues to see any old unused images from this addon after updating, please reopen this bug and let me know, and I'll dig into it further.