duplicati / duplicati

Store securely encrypted backups in the cloud!

Add support for cloudflare R2 #4673

Open option-greek opened 2 years ago

option-greek commented 2 years ago

Environment info

Description

It would be nice to have Cloudflare R2 supported as a backend. Since Cloudflare doesn't charge for egress, using it as a backend would help avoid the surprise restore costs that tend to be very high (egress is expensive across multiple providers). It would also be nice to keep the number of requests per second below the free limit so that backups don't incur any API costs either.

Steps to reproduce

Cloudflare R2 isn't present in the list of supported backends.

Screenshots

Debug log

ts678 commented 2 years ago

Cloudflare R2 was announced in September 2021 as an S3-compatible service, but it doesn't appear to be generally available yet. For S3 compatible services, Duplicati generally uses the S3 Compatible storage type plus a Server dropdown for common ones. Adding another would probably be pretty simple if you know what to use, but maybe Custom server url would also work fine.

I'm not sure how possible limiting the requests per second is, unless Amazon's or Minio's libraries (used here) can support that... The AWS SDK has a lot of options, shown in the Advanced options dropdown, but I'm not finding one talking about throttling. This may or may not be an actual problem, as you might hit some other limit (such as speed of file transfers) before an API limit.

I don't use S3, and I am not an expert in S3 pricing or in what counts as an API call. Duplicati does uploads, downloads, lists, and deletes. Anybody who can translate that into potential for API costs on Cloudflare R2 is invited to comment on the chances of charges...

duplicatibot commented 2 years ago

This issue has been mentioned on Duplicati. There might be relevant details there:

https://forum.duplicati.com/t/restore-backblaze-b2-files-via-cloudflare-proxy-url/13825/5

michael9dk commented 2 years ago

Notice the price for Cloudflare R2. It costs $0.015 per GB. Backblaze B2 is $0.005 per GB. Egress is $0.010 per GB.

The FREE way to get free egress, without additional cost, is explained here:

https://www.backblaze.com/blog/backblaze-and-cloudflare-partner-to-provide-free-data-transfer/

https://help.backblaze.com/hc/en-us/articles/217666928-Using-Backblaze-B2-with-the-Cloudflare-CDN

https://help.backblaze.com/hc/en-us/articles/360010017893-How-to-allow-Cloudflare-to-fetch-content-from-a-Backblaze-B2-private-bucket
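
To make the trade-off in those per-GB prices concrete, here is a rough sketch of the arithmetic. It assumes the $0.010 egress figure applies to Backblaze B2 and that R2 egress is free (as stated in the issue description); it ignores API request charges and free tiers, and the prices may well be out of date. The function and dictionary names are made up for illustration.

```python
from decimal import Decimal

# Per-GB prices quoted in the comment above (assumed to be per month).
R2 = {"storage_per_gb": Decimal("0.015"), "egress_per_gb": Decimal("0")}
B2 = {"storage_per_gb": Decimal("0.005"), "egress_per_gb": Decimal("0.010")}


def monthly_cost(stored_gb, restored_gb, storage_per_gb, egress_per_gb):
    """Storage cost plus egress cost for one month, in dollars."""
    return stored_gb * storage_per_gb + restored_gb * egress_per_gb


if __name__ == "__main__":
    for restored in (0, 100, 1000):
        print(
            restored,
            monthly_cost(1000, restored, **R2),
            monthly_cost(1000, restored, **B2),
        )
```

With these numbers, B2 only costs as much as R2 in a month where you restore as many GB as you store, so the cheaper option depends on how often you expect to do a full restore.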

jre21 commented 2 years ago

I just tried configuring Cloudflare R2 with the S3 compatible storage type, and ran into a series of problems.

First of all, Cloudflare's S3 endpoint is https://<account_id>.r2.cloudflarestorage.com, so adding it to the S3 drop-down wouldn't be feasible unless Duplicati can dynamically add a field for the account ID.

Second, Cloudflare's R2 API requires TLS. It took me considerable trial and error plus skimming Duplicati's source code to realize that a) "custom server url" needs to be a hostname (i.e., without the scheme prefix), and b) the s3 storage type uploads over plaintext http by default and the use-ssl flag is buried in the advanced menu. Moving the use-ssl flag into the main menu would be a big help to anyone trying to work with custom S3 backends.

Third, once I'd set up the configuration correctly and tried running a backup, the Cloudflare server quickly returned (501) Not Implemented. R2 has an incomplete implementation of the S3 protocol as documented here, and my guess is that Duplicati's s3 client sets configuration options that it doesn't (yet?) support. I'd be happy to debug further if someone could give me a pointer on how to see the specific s3 API calls Duplicati is making.
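
For the hostname point above, a small sketch of how a pasted endpoint could be reduced to the bare hostname that the "custom server url" field expects. This is not Duplicati code; `to_custom_server_host` is a hypothetical helper and `abc123` is a placeholder account ID.

```python
from urllib.parse import urlsplit


def to_custom_server_host(value):
    """Return only the hostname from an endpoint with or without a scheme."""
    value = value.strip()
    # urlsplit only recognises a host when a scheme or leading "//" is present.
    if "://" not in value:
        value = "//" + value
    host = urlsplit(value).hostname
    if not host:
        raise ValueError("no hostname found in endpoint")
    return host


if __name__ == "__main__":
    print(to_custom_server_host("https://abc123.r2.cloudflarestorage.com/"))
```

Note that `hostname` also drops any port and lowercases the result, which is fine for the R2 endpoint but may not suit other custom servers.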

ts678 commented 2 years ago

I'd be happy to debug further if someone could give me a pointer on how to see the specific s3 API calls Duplicati is making.

Help is always most welcome, but there's not a whole lot of help available for the helpers (due to lack of developer-volunteers).

Using AWS SDKs to Obtain Request IDs starts talking about low-level logging, and links to Logging with the AWS SDK for .NET.

Note that there's also a "Minio SDK" client library. I don't know if it might work any better, and I haven't looked into its logging.

duplicatibot commented 1 year ago

This issue has been mentioned on Duplicati. There might be relevant details there:

https://forum.duplicati.com/t/back-up-to-cloudflare-r2-storage-fails/15511/3

altendorfme commented 5 months ago

Configuration:

Storage type: S3 Compatible
Use SSL: check
Server: Custom server url: {account_id}.r2.cloudflarestorage.com (don't use https://)
Advanced Options:
- s3-ext-disablehostprefixinjection: check
- s3-disable-chunk-encoding: check
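
The checklist above can be expressed as a quick sanity check. This is only a sketch: the option names are the ones listed in this comment, while the function, its arguments, and the idea of passing the enabled options as a set are assumptions for illustration, not how Duplicati stores its settings.

```python
# Options this comment reports as needing to be checked for R2.
REQUIRED_FLAGS = (
    "use-ssl",
    "s3-ext-disablehostprefixinjection",
    "s3-disable-chunk-encoding",
)


def check_r2_settings(server, enabled_flags):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if "://" in server:
        problems.append("server must be a bare hostname, without https://")
    if not server.endswith(".r2.cloudflarestorage.com"):
        problems.append("server does not look like an R2 endpoint")
    for flag in REQUIRED_FLAGS:
        if flag not in enabled_flags:
            problems.append("option not enabled: " + flag)
    return problems


if __name__ == "__main__":
    for problem in check_r2_settings("https://abc123.r2.cloudflarestorage.com", {"use-ssl"}):
        print(problem)
```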

duplicatibot commented 5 months ago

This issue has been mentioned on Duplicati. There might be relevant details there:

https://forum.duplicati.com/t/back-up-to-cloudflare-r2-storage-fails/15511/18

ts678 commented 5 months ago

@altendorfme thanks for the test. I linked it from the similar issue in the forum. Since the s3-disable-chunk-encoding seemingly helps (and thanks for the rest), I'm wondering if this issue should be considered solved, but I'll wait awhile on it. Needing special settings is not ideal, but sometimes providers will require that...

duplicatibot commented 5 months ago

This issue has been mentioned on Duplicati. There might be relevant details there:

https://forum.duplicati.com/t/back-up-to-cloudflare-r2-storage-fails/15511/19

duplicatibot commented 5 months ago

This issue has been mentioned on Duplicati. There might be relevant details there:

https://forum.duplicati.com/t/back-up-to-cloudflare-r2-storage-fails/15511/20

W4JEW commented 5 months ago

Hello! I was the individual who started the Back Up to Cloudflare R2 Storage Fails thread on the Duplicati forums (back in December 2022).

I didn't revisit this issue until earlier today after coming across a message that indicated perhaps the issue was resolved or a workaround was discovered.

I followed the details in the thread and this issue. I made it quite a bit further than before, but the backup eventually failed.

What's interesting is it looks like the data was written to R2. The amount of data in the newly created R2 storage bucket is roughly equal to the amount of data consumed in the directory structure on the filesystem.

Steps to reproduce

  1. Fresh installation of Duplicati
  2. Created a new Cloudflare R2 storage bucket
  3. Created new R2 credentials with "Object Read & Write" permissions
  4. I used the Duplicati web interface to create a new backup job.

General backup Settings

Backup Destination

Advanced Options

NOTE: I selected 'Test connection' to verify the credentials.

I received the following warning message:

Adjust bucket name?
The bucket name should start with your username, prepend automatically? 

If I selected "Yes", the test failed. If I selected "No" the test was successful ("Connection worked!")

Source Data

I selected a directory on the local filesystem.

Schedule

I did not specify a schedule. I just wanted to run a manual backup to ensure everything worked.

The backup job started and looked like it was making really good progress. Then, about 80-90% of the way through the job, I received the following error message in the Duplicati web interface:

Unexpected number of remote volumes marked as deleted. Found 0 filesets, but 1 volumes

I expanded the "Operation backup failed" section and found there were two errors:

2024-05-26 20:19:47 -04 - [Error-Duplicati.Library.Main.Operation.BackupHandler-FatalError]: Fatal error
WebException: The remote server returned an error: (501) Not Implemented.
2024-05-26 20:19:47 -04 - [Error-Duplicati.Library.Main.Controller-FailedOperation]: The operation Backup has failed with error: One or more errors occurred. (The remote server returned an error: (501) Not Implemented. (The remote server returned an error: (501) Not Implemented.) (One or more errors occurred. (The remote server returned an error: (501) Not Implemented.)))
AggregateException: One or more errors occurred. (The remote server returned an error: (501) Not Implemented. (The remote server returned an error: (501) Not Implemented.) (One or more errors occurred. (The remote server returned an error: (501) Not Implemented.)))

I expanded the "Complete Log" section and found the following:

{
  "DeletedFiles": 0,
  "DeletedFolders": 0,
  "ModifiedFiles": 0,
  "ExaminedFiles": 7,
  "OpenedFiles": 5,
  "AddedFiles": 5,
  "SizeOfModifiedFiles": 0,
  "SizeOfAddedFiles": 267388003,
  "SizeOfExaminedFiles": 361624107,
  "SizeOfOpenedFiles": 267388003,
  "NotProcessedFiles": 0,
  "AddedFolders": 3,
  "TooLargeFiles": 0,
  "FilesWithError": 0,
  "ModifiedFolders": 0,
  "ModifiedSymlinks": 0,
  "AddedSymlinks": 0,
  "DeletedSymlinks": 0,
  "PartialBackup": false,
  "Dryrun": false,
  "MainOperation": "Backup",
  "CompactResults": null,
  "VacuumResults": null,
  "DeleteResults": null,
  "RepairResults": null,
  "TestResults": null,
  "ParsedResult": "Fatal",
  "Interrupted": false,
  "Version": "2.0.8.1 (2.0.8.1_beta_2024-05-07)",
  "EndTime": "2024-05-27T00:19:47.043413Z",
  "BeginTime": "2024-05-27T00:18:14.359965Z",
  "Duration": "00:01:32.6834480",
  "MessagesActualLength": 100,
  "WarningsActualLength": 0,
  "ErrorsActualLength": 2,
  "Messages": [
    "2024-05-26 20:18:14 -04 - [Information-Duplicati.Library.Main.Controller-StartingOperation]: The operation Backup has started",
    "2024-05-26 20:18:14 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Started:  ()",
    "2024-05-26 20:18:14 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Completed:  ()",
    "2024-05-26 20:18:24 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Started: duplicati-b4077b2fd410c403e9dc58ab8562c2b3c.dblock.zip.aes (49.91 MB)",
    "2024-05-26 20:18:25 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Started: duplicati-b0a0e08f553d74e79b61816dfe8c52c9f.dblock.zip.aes (49.97 MB)",
    "2024-05-26 20:18:30 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Retrying: duplicati-b4077b2fd410c403e9dc58ab8562c2b3c.dblock.zip.aes (49.91 MB)",
    "2024-05-26 20:18:30 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Retrying: duplicati-b0a0e08f553d74e79b61816dfe8c52c9f.dblock.zip.aes (49.97 MB)",
    "2024-05-26 20:18:36 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Started: duplicati-b1f53b6a35c0647ccbe12ef56bba1f6f1.dblock.zip.aes (49.96 MB)",
    "2024-05-26 20:18:37 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Started: duplicati-bbd00d9dbd79249219061fd2cc8463c3b.dblock.zip.aes (49.95 MB)",
    "2024-05-26 20:18:40 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Rename: duplicati-b4077b2fd410c403e9dc58ab8562c2b3c.dblock.zip.aes (49.91 MB)",
    "2024-05-26 20:18:40 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Rename: duplicati-b7018842e152b44b1adb8f87417f76752.dblock.zip.aes (49.91 MB)",
    "2024-05-26 20:18:40 -04 - [Information-Duplicati.Library.Main.Operation.Backup.BackendUploader-RenameRemoteTargetFile]: Renaming \"duplicati-b4077b2fd410c403e9dc58ab8562c2b3c.dblock.zip.aes\" to \"duplicati-b7018842e152b44b1adb8f87417f76752.dblock.zip.aes\"",
    "2024-05-26 20:18:40 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Started: duplicati-b7018842e152b44b1adb8f87417f76752.dblock.zip.aes (49.91 MB)",
    "2024-05-26 20:18:40 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Rename: duplicati-b0a0e08f553d74e79b61816dfe8c52c9f.dblock.zip.aes (49.97 MB)",
    "2024-05-26 20:18:40 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Rename: duplicati-b2c6fc4527d524e4ba2e8162b4cd8318c.dblock.zip.aes (49.97 MB)",
    "2024-05-26 20:18:40 -04 - [Information-Duplicati.Library.Main.Operation.Backup.BackendUploader-RenameRemoteTargetFile]: Renaming \"duplicati-b0a0e08f553d74e79b61816dfe8c52c9f.dblock.zip.aes\" to \"duplicati-b2c6fc4527d524e4ba2e8162b4cd8318c.dblock.zip.aes\"",
    "2024-05-26 20:18:40 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Started: duplicati-b2c6fc4527d524e4ba2e8162b4cd8318c.dblock.zip.aes (49.97 MB)",
    "2024-05-26 20:18:42 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Retrying: duplicati-b1f53b6a35c0647ccbe12ef56bba1f6f1.dblock.zip.aes (49.96 MB)",
    "2024-05-26 20:18:43 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Retrying: duplicati-bbd00d9dbd79249219061fd2cc8463c3b.dblock.zip.aes (49.95 MB)",
    "2024-05-26 20:18:46 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Retrying: duplicati-b2c6fc4527d524e4ba2e8162b4cd8318c.dblock.zip.aes (49.97 MB)"
  ],
  "Warnings": [],
  "Errors": [
    "2024-05-26 20:19:47 -04 - [Error-Duplicati.Library.Main.Operation.BackupHandler-FatalError]: Fatal error\nWebException: The remote server returned an error: (501) Not Implemented.",
    "2024-05-26 20:19:47 -04 - [Error-Duplicati.Library.Main.Controller-FailedOperation]: The operation Backup has failed with error: One or more errors occurred. (The remote server returned an error: (501) Not Implemented. (The remote server returned an error: (501) Not Implemented.) (One or more errors occurred. (The remote server returned an error: (501) Not Implemented.)))\nAggregateException: One or more errors occurred. (The remote server returned an error: (501) Not Implemented. (The remote server returned an error: (501) Not Implemented.) (One or more errors occurred. (The remote server returned an error: (501) Not Implemented.)))"
  ],
  "BackendStatistics": {
    "RemoteCalls": 23,
    "BytesUploaded": 0,
    "BytesDownloaded": 0,
    "FilesUploaded": 0,
    "FilesDownloaded": 0,
    "FilesDeleted": 0,
    "FoldersCreated": 0,
    "RetryAttempts": 20,
    "UnknownFileSize": 0,
    "UnknownFileCount": 0,
    "KnownFileCount": 0,
    "KnownFileSize": 0,
    "LastBackupDate": "0001-01-01T00:00:00",
    "BackupListCount": 0,
    "TotalQuotaSpace": 0,
    "FreeQuotaSpace": 0,
    "AssignedQuotaSpace": -1,
    "ReportedQuotaError": false,
    "ReportedQuotaWarning": false,
    "MainOperation": "Backup",
    "ParsedResult": "Success",
    "Interrupted": false,
    "Version": "2.0.8.1 (2.0.8.1_beta_2024-05-07)",
    "EndTime": "0001-01-01T00:00:00",
    "BeginTime": "2024-05-27T00:18:14.35997Z",
    "Duration": "00:00:00",
    "MessagesActualLength": 0,
    "WarningsActualLength": 0,
    "ErrorsActualLength": 0,
    "Messages": null,
    "Warnings": null,
    "Errors": null
  }
}
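
To see at a glance how many uploads were started, retried, or renamed, the "Messages" array of a log like the one above can be tallied. A minimal sketch, assuming the lines keep the "Backend event: <operation> - <state>:" shape seen here; the function name is made up.

```python
import re
from collections import Counter

# Matches e.g. "Backend event: Put - Retrying:" and captures both words.
EVENT = re.compile(r"Backend event: (\w+) - (\w+):")


def count_backend_events(messages):
    """Count (operation, state) pairs found in Duplicati log messages."""
    counts = Counter()
    for line in messages:
        match = EVENT.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "[Information-...-BackendEvent]: Backend event: List - Started:  ()",
        "[Information-...-BackendEvent]: Backend event: Put - Started: a.dblock.zip.aes (49.91 MB)",
        "[Information-...-BackendEvent]: Backend event: Put - Retrying: a.dblock.zip.aes (49.91 MB)",
    ]
    for key, n in count_backend_events(sample).items():
        print(key, n)
```

To run it on a real report, load the JSON with `json.load` and pass its "Messages" list. Keep in mind that "MessagesActualLength" is 100 here while only 20 messages are shown, so counts from this excerpt would be partial.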

Update to my post - I noticed there were two additional attempts to perform the backup.

There are two errors and one warning in each of the two subsequent attempts:

Warning

2024-05-26 22:34:17 -04 - [Warning-Duplicati.Library.Main.Operation.Backup.UploadSyntheticFilelist-MissingTemporaryFilelist]: Expected there to be a temporary fileset for synthetic filelist (1, duplicati-20240527T002144Z.dlist.zip.aes), but none was found?

Error 01 of 02

2024-05-26 22:37:42 -04 - [Error-Duplicati.Library.Main.Operation.BackupHandler-FatalError]: Fatal error
Exception: Unexpected number of remote volumes marked as deleted. Found 0 filesets, but 1 volumes

Error 02 of 02

2024-05-26 22:37:42 -04 - [Error-Duplicati.Library.Main.Controller-FailedOperation]: The operation Backup has failed with error: Unexpected number of remote volumes marked as deleted. Found 0 filesets, but 1 volumes
Exception: Unexpected number of remote volumes marked as deleted. Found 0 filesets, but 1 volumes

This is a screenshot from the Cloudflare dashboard showing the R2 storage bucket:

image

kenkendk commented 5 months ago

@W4JEW, looking at the log, I see the retry uploads:

"2024-05-26 20:18:30 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Retrying: duplicati-b4077b2fd410c403e9dc58ab8562c2b3c.dblock.zip.aes (49.91 MB)",

This indicates that the upload (PUT) fails, presumably with the 501 error. But from the screenshot, it appears that some files were actually uploaded?

I think the other errors are caused by this issue.

I cannot see any options that should be applied to the PUT request that are not listed as supported by R2, but the downside of using the official S3 client is that it tends to cater to the AWS setup, sometimes adding "helpful" extra values.

You can try using the alternate minio client by adding the advanced option:

--s3-client=minio

This client is slightly less opinionated and tends to work better with non-AWS destinations.

I received the following warning message:

Adjust bucket name?
The bucket name should start with your username, prepend automatically? 

That is an S3 thing. In AWS S3 the bucket names are global, so Amazon recommends using the account id as a prefix for the bucket name. For R2 the account name is encoded in the hostname instead.

First of all, Cloudflare's S3 endpoint is https://<account_id>.r2.cloudflarestorage.com, so adding it to the S3 drop-down wouldn't be feasible unless Duplicati can dynamically add a field for the account ID.

Second, Cloudflare's R2 API requires TLS. It took me considerable trial and error plus skimming Duplicati's source code to realize that a) "custom server url" needs to be a hostname (i.e., without the scheme prefix), and b) the s3 storage type uploads over plaintext http by default and the use-ssl flag is buried in the advanced menu. Moving the use-ssl flag into the main menu would be a big help to anyone trying to work with custom S3 backends.

@jre21 I think the use-ssl option should be deprecated in favor of a disable-ssl option, but when Duplicati started, SSL was not common.

I think the UI here is oriented very much towards the way AWS is working. For something like R2 where the account id is in the hostname, it would make sense to make a different UI.

W4JEW commented 5 months ago

@kenkendk you are correct that, despite the 501 errors, the files appeared to have been backed up as expected.

It looks like adding --s3-client=minio did the trick!

For anyone that wants to try this - I added the --s3-client=minio option in the following file:

/etc/default/duplicati
# Defaults for duplicati initscript
# sourced by /etc/init.d/duplicati
# installed at /etc/default/duplicati by the maintainer scripts

#
# This is a POSIX shell fragment
#

# Additional options that are passed to the Daemon.
DAEMON_OPTS="--webservice-interface=any --s3-client=minio"

NOTE: My config also includes --webservice-interface=any but you can safely omit that option if you don't need it. The web service binds to 127.0.0.1 by default and I wanted to be able to access the Duplicati web admin interface (port 8200/tcp) from a different computer.

Once I added --s3-client=minio to the configuration file, I simply restarted Duplicati and checked the status:

sudo systemctl restart duplicati.service
sudo systemctl status duplicati.service

I have little prior exposure to AWS S3, so I was surprised to learn that S3 doesn't use SSL/TLS by default. I understand not wanting the overhead of SSL/TLS encrypt/decrypt.

Cloudflare defaults to SSL/TLS encryption everywhere. They automatically redirect any HTTP requests to HTTPS.

Since Cloudflare R2 seems to stray from the typical S3 workflow, it would be ideal to have a Cloudflare R2 provider to automate the selection of ideal settings.

A point of notable mention is that Cloudflare added "jurisdiction-specific endpoints for S3 clients," whereby they provide a unique endpoint specifically for the European Union (EU).

For example:

If the S3 client endpoint is: https://{{my-cloudflare-account-id}}.r2.cloudflarestorage.com

The European Union (EU) S3 client endpoint is: https://{{my-cloudflare-account-id}}.eu.r2.cloudflarestorage.com

European customers/users would appreciate an option that automatically includes the .eu subdomain.
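
As a rough idea of what such an option would have to compute, here is a sketch that builds the hostname for either endpoint form shown above. The function name and the `jurisdiction` parameter are hypothetical, and only the "eu" value comes from this thread.

```python
def r2_endpoint_host(account_id, jurisdiction=None):
    """Build the R2 S3 hostname, optionally with a jurisdiction label."""
    if not account_id:
        raise ValueError("account_id is required")
    parts = [account_id]
    if jurisdiction:
        # e.g. "eu" gives <account_id>.eu.r2.cloudflarestorage.com
        parts.append(jurisdiction)
    parts.append("r2.cloudflarestorage.com")
    return ".".join(parts)


if __name__ == "__main__":
    print(r2_endpoint_host("abc123"))
    print(r2_endpoint_host("abc123", "eu"))
```

The result is a bare hostname (no https://), which matches what the "custom server url" field was reported to need earlier in this issue.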

kenkendk commented 5 months ago

For anyone that wants to try this - I added the --s3-client=minio option in the following file:

/etc/default/duplicati

The intention for that file is to have global settings only. In this case the --s3-client option is only relevant for the S3 backend, so if you have backups to destinations other than S3, you will get warnings about an unused option.

You can instead add the option on the backup, which would also allow you to have backups to both R2 and AWS.

I have little prior exposure to AWS S3, so I was surprised to learn that S3 doesn't use SSL/TLS by default. I understand not wanting the overhead of SSL/TLS encrypt/decrypt.

I think this is just because of the age of S3. Had it been created today it would be based on TLS. Back then there was a concern about the additional CPU & latency required on S3 servers for establishing the secure channel, so they designed the signature to allow plain-text authentication.