duplicati / duplicati

Store securely encrypted backups in the cloud!

Duplicati silently, permanently deleted backup from google drive - two-machine use case #3845

Open dananarama opened 5 years ago

dananarama commented 5 years ago

Environment info

Description

Duplicati silently and permanently deleted backup files from the remote server. Use case: Backing up from two machines onto google drive.
Scenario: Backed up on machine one. On machine two, attempted a restore from the existing configuration. Got an error message that there were unrecognized files and that a repair was needed. Clicked repair. After the fact noted that the most recent backup (from machine one) was not listed as an option in restore. Looked at google drive and saw in activity that the backup files were permanently deleted at the time I performed the repair.

To restate: When trying to perform a restore from a second machine, Duplicati saw an unrecognized backup and permanently deleted it.

This is absolutely a deal-breaker. I have uninstalled Duplicati from my systems. I'm not sure how other people are able to use this.

Steps to reproduce

See above.

Screenshots

Debug log

drwtsn32x commented 5 years ago

Were the two machines backing up to exact same location in Google Drive?

dananarama commented 5 years ago

Yes. The use case was collaboration between two machines. Either machine backs up to the location and either machine can restore from it. So machine one can make changes, back up, and the other machine can restore to get the changes. In this scenario, both machines already had a configuration set up pointing to the google drive directory. One machine created the configuration and did the initial backup. The second machine created a new, identical configuration and did a repair to set up its initial database. This procedure was described by the Duplicati admins in a post online.

drwtsn32x commented 5 years ago

I am familiar with the idea of duplicating a backup job config from PC-A onto PC-B, and then using Repair on PC-B to initialize the local database. This would allow you to restore data from PC-A through PC-B's Duplicati instance, or would be a good procedure if you are migrating to a new PC.

BUT

This would not allow you to simultaneously run backups from both PC-A and PC-B to the same back end target. They will just stomp all over each other.

The only way it could possibly work is:

  • PC-A and PC-B must never run a backup at the same time.
  • Any time PC-A does a backup or otherwise modifies the back end data, a repair must be done on PC-B before you do a restore or backup job on PC-B.
  • Likewise, any time PC-B does a backup or modifies the back end data, a repair must be done on PC-A before you do a restore or backup.
  • And maybe other things I'm not thinking of.

In my opinion this workflow is horrible - Duplicati is not built for it. If you weren't following those above caveats I'm not surprised you ran into trouble.

It would be better to have each PC do its own backups to a different back end (or a different target folder in the same back end). If you need the ability to restore one PC's data onto the other, you could duplicate PC-A's job onto PC-B (and vice versa), but ONLY use it for restores. And you would be required to do a "repair" before any restore attempt to make sure the local database matches the back end storage.
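
A minimal sketch of the coordination rule those caveats imply, assuming a hypothetical helper that caches the last-seen remote listing locally. None of this is Duplicati code; STATE_FILE, needs_repair, and the rest are made up for illustration:

```python
# Hypothetical sketch (not Duplicati code): each machine remembers the remote
# file listing its local database was last synced against, and refuses to back
# up or restore until a repair is done if the back end changed in the meantime.
import json
from pathlib import Path

STATE_FILE = Path("last_synced_listing.json")  # hypothetical local cache


def load_last_synced() -> set:
    """Remote file names this machine's database was last known to match."""
    if STATE_FILE.exists():
        return set(json.loads(STATE_FILE.read_text()))
    return set()


def record_sync(remote_files: set) -> None:
    """Call after a successful backup or repair on this machine."""
    STATE_FILE.write_text(json.dumps(sorted(remote_files)))


def needs_repair(remote_files: set) -> bool:
    """True if the other machine changed the back end since we last synced."""
    return remote_files != load_last_synced()


# Usage: list the remote folder with whatever tooling you use, then:
#   if needs_repair(current_listing):
#       print("Back end changed by the other machine - run a repair first.")
#   else:
#       ...restore or back up, then record_sync(current_listing)
```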

dananarama commented 5 years ago

The act of doing a repair from PC-B prior to a restore is what caused Duplicati to delete the backup performed by PC-A. Duplicati warned of unrecognized files on the remote and recommended a repair. Performing this repair deleted the backup.

It seems if Duplicati was not meant to restore a backup to a PC other than the PC performing the original backup, this is a significant limitation and should be advertised clearly.

drwtsn32x commented 5 years ago

> It seems if Duplicati was not meant to restore a backup to a PC other than the PC performing the original backup, this is a significant limitation and should be advertised clearly.

You can do it - I have done it myself more than once. I imported/duplicated PC-A's backup job onto PC-B. I made sure I turned off the automatic schedule for this job on PC-B. I ran a repair to create the local database. I could then restore PC-A's files using PC-B's Duplicati without issue. But I never attempted to use PC-B to back up or modify the backup data that I consider to be "owned" by PC-A.

I'm not too familiar with the database repair process, but I thought it only modified the local database - an attempt to rebuild it based on data on the back end. I am not aware of the circumstances where it could or would try to delete back-end data.

dananarama commented 5 years ago

This is the crux of the issue I am reporting. Performing a repair did, in fact, result in Duplicati permanently deleting files from google drive. Note these were permanently deleted, not simply removed to the trash.

BlueBlock commented 5 years ago

So it seems Duplicati is allowing an action by the user which shouldn't be allowed, or at least should be warned against, in the UI.

This ties into another topic I've been thinking about. Many backup solutions also store at least a copy of the backup job itself on the destination. If Duplicati were to do the same, each job file could also record an "owner" computer (some unique ID of the machine). If another computer opens the job, non-read-only actions could be flagged as unavailable unless the user explicitly chooses to take ownership of the backup.
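
A rough sketch of how such an owner marker could work, assuming a hypothetical backup-owner.json file written next to the backup data. This is not an existing Duplicati feature or API; every name below is illustrative:

```python
# Hypothetical sketch of the "owner marker" idea (not part of Duplicati):
# a small JSON file stored next to the backup data records which machine
# "owns" the backup set, and destructive operations check it first.
import json
import uuid
from pathlib import Path

MARKER_NAME = "backup-owner.json"  # hypothetical file name on the destination


def local_machine_id() -> str:
    # Any stable per-machine identifier would do; a MAC-derived value as a stand-in.
    return str(uuid.getnode())


def write_owner_marker(dest: Path) -> None:
    (dest / MARKER_NAME).write_text(json.dumps({"owner": local_machine_id()}))


def check_ownership(dest: Path, take_ownership: bool = False) -> None:
    """Refuse destructive operations unless this machine owns the backup set."""
    marker = dest / MARKER_NAME
    if not marker.exists():
        write_owner_marker(dest)  # first machine claims the backup set
        return
    owner = json.loads(marker.read_text())["owner"]
    if owner != local_machine_id():
        if not take_ownership:
            raise PermissionError(
                "Backup set is owned by another machine; read-only operations "
                "only, or re-run with take_ownership=True."
            )
        write_owner_marker(dest)  # explicit, user-confirmed takeover
```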

drwtsn32x commented 5 years ago

Great idea @BlueBlock - I like it

ts678 commented 5 years ago

"Repair command deletes remote files for new backups" (#3416) is the same sort of problem, and people sometimes hit this when restoring the DB from an old image file or other backup. Repair tries to reconcile the DB with files it doesn't know about, and the result is bad. Repair and recreate are being rewritten by the lead author, but it's slow going, I know no details, and I especially don't know if it will keep the same approach of reconciling things by deletion. In the GUI, the user could be asked about it; from the CLI I'm unsure.

"Backup to the same bucket/folder should raise a warning" has discussion of ideas on similar warnings.
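
For illustration only, a sketch of what a "reconcile by asking" step could look like instead of reconciling by deletion. This is not how Duplicati's repair is actually implemented; reconcile_extra_remote_files is a made-up name:

```python
# Hypothetical sketch of a repair step that asks before deleting unknown
# remote files, rather than reconciling by silent deletion.
def reconcile_extra_remote_files(known_files: set,
                                 remote_files: set,
                                 confirm=input) -> list:
    """Return only the unknown remote files the user explicitly agreed to delete."""
    unknown = sorted(remote_files - known_files)
    if not unknown:
        return []
    print("The remote folder contains files the local database does not know about:")
    for name in unknown:
        print("  ", name)
    answer = confirm("Delete these files from the remote? Type DELETE to confirm: ")
    return unknown if answer.strip() == "DELETE" else []


# Example (nothing is actually deleted here):
#   to_delete = reconcile_extra_remote_files({"a.dblock"}, {"a.dblock", "b.dblock"})
```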

dandar commented 5 years ago

How about:

  1. Use a sync tool like Syncthing to sync PC-A and PC-B, and use a backup tool like Duplicati on one PC (either A or B as the master PC).

  2. Limit each Duplicati backup location/folder to one PC, unless restoring and taking ownership as BlueBlock suggested. If two PCs try to back up to the same location/folder, Duplicati would warn or block the non-owner PC and make a suggestion like option 1. Duplicati could also create a subfolder per PC ID, which would let users back up multiple PCs to the same location and still get de-duplication on that location/folder if the different PCs use the same encryption passphrase (a minimal sketch follows after this comment).

In any case, Duplicati should WARN and require two or three CONFIRMATIONS before deleting any files on the backup destination for a db repair/reconcile/optimize, etc. If local files turn out to be missing at backup time, also WARN the user and suggest a drive check/repair/recovery/test/replacement, with a link to a how-to or tutorial. User education is the essence of GUI success.
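
A minimal sketch of the per-PC subfolder idea from point 2 above, assuming a hypothetical helper that appends a machine-specific folder to the destination URL. The URL scheme and names are illustrative, not an existing Duplicati option:

```python
# Hypothetical sketch (not a Duplicati feature): derive a distinct target
# folder per machine so two PCs never write into the same backup set.
import platform
import uuid
from urllib.parse import quote


def per_machine_target(base_url: str) -> str:
    """Append a machine-specific subfolder to a backup destination URL."""
    machine_id = f"{platform.node()}-{uuid.getnode():x}"
    return base_url.rstrip("/") + "/" + quote(machine_id)


# Example (illustrative URL only):
#   per_machine_target("googledrive://backups/work")
#   -> "googledrive://backups/work/pc-a-9a3f1c2b7d0e"
```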

Xavron commented 3 years ago

Just had this happen today with a local backup and a backup config that was exported and then imported.

It complains about files not matching and says to repair the db (after that I noticed it looks like that's because it set up some weird Source data folder and path on import, without the actual files)...

Tried repairing, deleting the db, etc. It kept complaining. Eventually it silently deleted the 15 GB backup. I did not select to do that, nor was I aware of it happening.

Edit: Will see if I can duplicate it and get a log for it. Since the other problem has been presented with additional problems in the meantime anyway getting in my way :-)

duplicatibot commented 2 years ago

This issue has been mentioned on Duplicati. There might be relevant details there:

https://forum.duplicati.com/t/best-way-to-kick-off-new-backups/14747/2

Radivarig commented 2 years ago

This issue is critical in severity and should get more attention.

ts678 commented 2 years ago

> should get more attention

I removed the "enhancement" label that a former developer added on Sep 7, 2019 when self-assigning this. No known action. That might at least get it off the priority reduction that enhancements may get. Any developer want to volunteer for this one? This includes community members who can contribute time and skills. Without volunteers, there's nobody to give attention...

My ideas on a code approach are in #4579 ("duplicati-2.0.6.3-2.0.6.3_beta_20210617 just destoryed one month worth of backup"). Unfortunately there are several open issues on this (follow the links), so discussion of how to handle it is somewhat spread out.

Although it's a poor substitute, non-C# volunteers who can do pull requests can contribute to the user manual which is here.