jcwillox / hass-auto-backup

🗃️ Improved Backup Service for Home Assistant that can Automatically Remove Backups and Supports Generational Backup Schemes.
https://jcwillox.github.io/hass-auto-backup
MIT License
346 stars 24 forks source link

Auto purge not working on expired backups #9

Open kaizersoje opened 4 years ago

kaizersoje commented 4 years ago

I have set auto_purge to true. However, I have noticed that expired backups are not getting automatically purged. Not sure what's wrong.

jcwillox commented 4 years ago

Likely, the backups are not being monitored for deletion. Backups will only be monitored if you set the keep_days parameter to a value > 0 in the service call. You can check if any backups are being monitored using either the sensor sensor.auto_backup which will show the number of monitored backups or by looking at the file /config/.storage/auto_backup.snapshots_expiry

Also, keep in mind that setting auto_purge to true doesn't mean expired backups are remove immediately when they expire, they will only be removed (purged) when creating a new backup via a service call to auto_backup (i.e not through the web UI). They will also be purged when calling the auto_backup.purge service.

kaizersoje commented 4 years ago

I have set the keep_days parameter in my automations. I checked out the sensor sensor.auto_backup and found that monitored_snapshots is 0 and purgeable_snapshots is 0. What's wrong?

jcwillox commented 4 years ago

Could you provide a copy of the service call your making in your automation?

Can you call the service below from the Developer Tools > Services section.

service: auto_backup.snapshot_partial
service_data:
  name: Test
  keep_days: 0.0014

After calling that service, check the sensor to see if monitored_snapshots has changed.

Also if you can provide the logs created when calling the service that would be great. To enable debug logging use the logger component. Below is an example to enable debug logging for auto_backup, you can just paste that into your configuration.yaml.

logger:
  default: warning
  logs:
    custom_components.auto_backup: debug
kaizersoje commented 4 years ago

I have provided a copy of my service call below

service: auto_backup.snapshot_full
    data_template:
      name: "Full AutoBackup: {{ now().strftime('%a %d/%m/%y') }}"
      keep_days: 7
kaizersoje commented 4 years ago

I tried the service call below

service: auto_backup.snapshot_partial
service_data:
  name: Test
  keep_days: 0.0014

I can see that the monitored_snapshots is now showing 1

kaizersoje commented 4 years ago

purgeable_snapshots is showing up as 1 too

jcwillox commented 4 years ago

Well sounds like everything is working correctly, you can now either call auto_backup.purge to remove the expired backup, or when you create your next backup it will be removed.

There may just be an issue with your automations. Can you confirm that your automations are valid and are triggering correctly? Try running 'Check Configuration' to make sure your YAML is valid.

This is an example of a valid automation that you could put in your automations.yaml file. keep_days is set to 2 minutes to make testing easier.

- alias: Perform Daily Backup
  trigger:
    platform: time
    at: "02:30:00"
  action:
    service: auto_backup.snapshot_full
    data_template:
      name: "DailyBackup: {{ now().strftime('%A, %B %-d, %Y') }}"
      keep_days: 0.0014
kaizersoje commented 4 years ago

My backups are triggering correctly. Below are the 2 automations I have

- alias: Perform Full Backup
  trigger:
    platform: time
    at: "01:30:00"
  condition:
    condition: time # Perform backup on Mondays.
    weekday:
      - mon
      - thu
  action:
    service: auto_backup.snapshot_full
    data_template:
      name: "Full AutoBackup: {{ now().strftime('%a %d/%m/%y') }}"
      keep_days: 7

- alias: Perform Partial Backup
  trigger:
    platform: time
    at: "01:30:00"
  condition:
    - condition: time
      weekday:
        - tue
        - wed
        - fri
        - sat
        - sun
  action:
    service: auto_backup.snapshot_partial 
    data_template:
      name: "Partial AutoBackup: {{ now().strftime('%a %d/%m/%y') }}"
      folders:
        - homeassistant
        - share
        - ssl
      keep_days: 2
jcwillox commented 4 years ago

Yep, those look fine, try manually triggering one of the automations from the web UI (wait for it to finish) and watch the Auto Backup sensor.

Also, it doesn't look like your automations clash, so this shouldn't apply to your situation, but I still want to mention that Auto Backup can (currently) only perform one snapshot at a time, if you try to create another snapshot before the other one has finished, Hass.io may still create the second snapshot but it will not be monitored by Auto Backup.

kaizersoje commented 4 years ago

I manually triggered and the automation successfully ran and created the backup. However, monitored_snapshots is still showing up as 0.

kaizersoje commented 4 years ago

I saw this notification this morning

Snapshot Failed. Name: Partial AutoBackup: Sat 14/03/20 Error: There may be a backup already in progress.

I am not sure how this is possible.

jcwillox commented 4 years ago

Yeah, I'm not sure either. That error means the request to the Hass.io API was successful, but the response from Hass.io was not valid, i.e it didn't contain the identifier for the snapshot. Hass.io may have created the backup but some error happened and it didn't give Auto Backup the ID.

Usually, this is caused when another backup is currently in progress, so naturally, make sure there are no other automations or addons that could be creating a backup at that time.

Did you just add that notification automation, or has that been present the whole time?

To help you any further I'm going to need those debug logs that I mentioned earlier, so I can properly see what's going on.

kaizersoje commented 4 years ago

I have attached the logs below from last night.

The backup Full AutoBackup: Mon 16/03/20 was created successfully. However, I noticed this in my notifications too, which I think was created around 1.50AM.

Snapshot Failed.
Name: Full AutoBackup: Mon 16/03/20
Error:
2020-03-16 01:30:00 DEBUG (MainThread) [custom_components.auto_backup] Creating snapshot Full AutoBackup: Mon 16/03/20
2020-03-16 01:30:00 DEBUG (MainThread) [custom_components.auto_backup] New snapshot; command: /snapshots/new/full, keep_days: 7.0, data: {'name': 'Full AutoBackup: Mon 16/03/20'}, timeout: 1200
2020-03-16 01:50:00 ERROR (MainThread) [custom_components.auto_backup] Timeout on /snapshots/new/full request
2020-03-16 01:50:00 ERROR (MainThread) [custom_components.auto_backup] Error during backup. 
2020-03-16 01:50:00 DEBUG (MainThread) [custom_components.auto_backup] No snapshots required purging.
jcwillox commented 4 years ago

Thanks! This makes it a lot clearer.

The issue is due to a timeout error. Auto Backup waits by default 20 minutes for a snapshot to complete, if it takes longer than that, it is assumed the snapshot failed.

This is usually caused by taking large backups. There is a configuration option to increase this timeout backup_timeout. This is very likely the cause as Hass.io would still create the backup but it will not be monitored. I was actually thinking yesterday that this may have been the issue.

Try taking a manual full backup and timing how long it takes then add like 20 minutes.

auto_backup:
  auto_purge: true
  backup_timeout: 2400  # 40 minutes, needs to be converted to seconds.
kaizersoje commented 4 years ago

I increased the timeout time to 40 minutes as the full backup took around 21 minutes.

2020-03-18 01:30:00 DEBUG (MainThread) [custom_components.auto_backup] Creating snapshot Partial AutoBackup: Wed 18/03/20
2020-03-18 01:30:00 DEBUG (MainThread) [custom_components.auto_backup] New snapshot; command: /snapshots/new/partial, keep_days: 2.0, data: {'name': 'Partial AutoBackup: Wed 18/03/20', 'folders': ['homeassistant', 'share', 'ssl']}, timeout: 2400
2020-03-18 02:10:00 ERROR (MainThread) [custom_components.auto_backup] Timeout on /snapshots/new/partial request
2020-03-18 02:10:00 ERROR (MainThread) [custom_components.auto_backup] Error during backup. 
2020-03-18 02:10:00 DEBUG (MainThread) [custom_components.auto_backup] No snapshots required purging.

At 40 minutes, the snapshot creation timed out. However, I can see that the Partial Backup Partial AutoBackup: Wed 18/03/20 was still created. I am getting quite confused now.

jcwillox commented 4 years ago

Well, this has me very confused, I've tried to do some digging into the snapshots API and can't find anything to explain why this is occurring. How big are you full/partial snapshots? Was the partial snapshot created after the timeout, i.e did it take over 40 minutes to create?

What is seems like is that the API is waiting forever and not returning a response, so maybe it's some kind of connection issue?

kaizersoje commented 4 years ago

Yes, the backups are getting created. I increased the timeout to 50 minutes but still the same issue. Full backup is a little under 1GB. Partial backup is just under 400MB.

jcwillox commented 4 years ago

My best guess at this point considering when you did an empty partial backup everything worked correctly, is that due to their size and possibly your hardware that either

Otherwise, if the backup completes before AutoBackup reports an error

For context, my full backups are about 100MB (occasionally up to 160MB) and partial backups are 60MB and I've never had an issue with timeouts or otherwise. They take about 1.5 minutes and 30 seconds respectively.

Which occurs first, the backup being created or Auto Backup reporting an error? This will tell me which path to go down.

danlun commented 4 years ago

Just reading and might be totally wrong but cant a external storage situation or similar make time outs when not wanted. I have no idea of course in this case if storage is on another unit with a path, but it was just a thought in my head. Take care.

Melkor82 commented 4 years ago

Hello, I'm facing the exact issue. I can say that everything was working fine with the same backup dimensions before until some version before. I would say until 106.0 or something was working fine

johny-mnemonic commented 4 years ago

I am not sure, whether I have the same issue, but my list of snapshots got really long and I started to look why the purging is not working and it seems to have been caused by the fact, that all requests for snapshot creation were timing out and hence autobackup was not getting their IDs. When I found it, my home-assistant_v2.db already had 16GB and both full and partial snapshots had around 900MB. So I increased the timeout for snapshot creation and since then the monitored snapshots number is raising from zero, where it was. I also cleaned my HA DB and reduced it to 2,7GB, so it improved snapshot times.

fantnhu commented 8 months ago

Same error. I create the backups with Node-Red, they succeed. I run the purge before every new save (they are set to 2 days). Unfortunately, it doesn't delete anything, it doesn't monitor anything, it just makes new ones. Now I have a 200Gb backup because nothing is deleted by the purge and the number of monitored backups: 0

psitem commented 8 months ago

I'm a recent convert and confirm that the keep_days setting and auto_backup.purge service call only impact backups created by auto_backup.

fantnhu commented 8 months ago

I'm a recent convert and confirm that the keep_days setting and auto_backup.purge service call only impact backups created by auto_backup.

ok i think everyone is aware of this. we have read the documentation.