rustic-rs / rustic_core

rustic_core - library for fast, encrypted, deduplicated backups that powers rustic-rs
https://rustic.cli.rs/ecosystem/rustic-core/
Apache License 2.0
38 stars 15 forks source link

Not enough context to root-cause S3 access error #346

Open philipmw opened 2 weeks ago

philipmw commented 2 weeks ago

I am having a problem configuring rustic with a set of hot and cold S3 buckets. That may be its own bug, as I believe all permissions are properly configured, but the meta-bug that I want to report here is that rustic doesn't give me enough debug information, even at trace log level, to root-cause the issue.

What I want: for rustic to print the S3 response so I can see why S3 is returning a 400 status code.

What I get: just the status code and headers. S3 has a bunch of reasons why it would give a 400 status code, so I need more.

% rustic init --log-level trace
[INFO] using config ./rustic.toml
[DEBUG] (1) opendal::services::s3::backend: backend build started: S3Builder { config: S3Config { root: None, bucket: "pmw-backups-laptop-rustic-cold", endpoint: None, region: None, .. }, .. }
[DEBUG] (1) opendal::services::s3::backend: backend use root /
[DEBUG] (1) opendal::services::s3::backend: backend use bucket pmw-backups-laptop-rustic-cold
[DEBUG] (1) reqsign::aws::config: load_via_profile_config_file failed: No such file or directory (os error 2)
[DEBUG] (1) opendal::services::s3::backend: backend use region: us-west-2
[DEBUG] (1) opendal::services::s3::backend: backend use endpoint: https://s3.us-west-2.amazonaws.com/pmw-backups-laptop-rustic-cold
[TRACE] (1) mio::poll: [/private/tmp/nix-build-rustic-0.9.4.drv-0/rustic-0.9.4-vendor.tar.gz/mio/src/poll.rs:569] registering event source with poller: token=Token(1), interests=READABLE
[DEBUG] (1) opendal::services: service=s3 name=pmw-backups-laptop-rustic-cold : metadata started
[DEBUG] (1) opendal::services: service=s3 name=pmw-backups-laptop-rustic-cold : metadata finished
[DEBUG] (1) opendal::services: service=s3 name=pmw-backups-laptop-rustic-cold : metadata started
[DEBUG] (1) opendal::services: service=s3 name=pmw-backups-laptop-rustic-cold : metadata finished
[DEBUG] (1) opendal::services: service=s3 name=pmw-backups-laptop-rustic-cold : metadata started
[DEBUG] (1) opendal::services: service=s3 name=pmw-backups-laptop-rustic-cold : metadata finished
[DEBUG] (1) opendal::services::s3::backend: backend build started: S3Builder { config: S3Config { root: None, bucket: "pmw-backups-laptop-rustic-hot", endpoint: None, region: None, .. }, .. }
[DEBUG] (1) opendal::services::s3::backend: backend use root /
[DEBUG] (1) opendal::services::s3::backend: backend use bucket pmw-backups-laptop-rustic-hot
[DEBUG] (1) reqsign::aws::config: load_via_profile_config_file failed: No such file or directory (os error 2)
[DEBUG] (1) opendal::services::s3::backend: backend use region: us-west-2
[DEBUG] (1) opendal::services::s3::backend: backend use endpoint: https://s3.us-west-2.amazonaws.com/pmw-backups-laptop-rustic-hot
[DEBUG] (1) opendal::services: service=s3 name=pmw-backups-laptop-rustic-hot : metadata started
[DEBUG] (1) opendal::services: service=s3 name=pmw-backups-laptop-rustic-hot : metadata finished
[DEBUG] (1) opendal::services: service=s3 name=pmw-backups-laptop-rustic-hot : metadata started
[DEBUG] (1) opendal::services: service=s3 name=pmw-backups-laptop-rustic-hot : metadata finished
[DEBUG] (1) opendal::services: service=s3 name=pmw-backups-laptop-rustic-hot : metadata started
[DEBUG] (1) opendal::services: service=s3 name=pmw-backups-laptop-rustic-hot : metadata finished
[DEBUG] (1) opendal::services: service=s3 name=pmw-backups-laptop-rustic-cold : metadata started
[DEBUG] (1) opendal::services: service=s3 name=pmw-backups-laptop-rustic-cold : metadata finished
[DEBUG] (1) opendal::services: service=s3 name=pmw-backups-laptop-rustic-hot : metadata started
[DEBUG] (1) opendal::services: service=s3 name=pmw-backups-laptop-rustic-hot : metadata finished
[TRACE] (1) rustic_backend::opendal: [/private/tmp/nix-build-rustic-0.9.4.drv-0/rustic-0.9.4-vendor.tar.gz/rustic_backend/src/opendal.rs:187] listing tpe: Config
[DEBUG] (1) opendal::services: service=s3 name=pmw-backups-laptop-rustic-cold path=config: stat started
[DEBUG] (1) reqsign::aws::v4: calculated scope: 20241030/us-west-2/s3/aws4_request
[DEBUG] (1) reqsign::aws::v4: calculated string to sign: AWS4-HMAC-SHA256
20241030T235504Z
20241030/us-west-2/s3/aws4_request
9df280dec8161b026ac8e1bea92c72e1df19c07f80c33cef7bbc1b230dbdb4e7
[TRACE] (1) hyper_util::client::legacy::pool: [/private/tmp/nix-build-rustic-0.9.4.drv-0/rustic-0.9.4-vendor.tar.gz/hyper-util/src/client/legacy/pool.rs:671] checkout waiting for idle connection: ("https", s3.us-west-2.amazonaws.com)
[DEBUG] (1) reqwest::connect: starting new connection: https://s3.us-west-2.amazonaws.com/
[TRACE] (1) hyper_util::client::legacy::connect::http: [/private/tmp/nix-build-rustic-0.9.4.drv-0/rustic-0.9.4-vendor.tar.gz/hyper-util/src/client/legacy/connect/http.rs:372] Http::connect; scheme=Some("https"), host=Some("s3.us-west-2.amazonaws.com"), port=None
[DEBUG] (1) hyper_util::client::legacy::connect::dns: resolve; host=s3.us-west-2.amazonaws.com
[TRACE] (1) tracing::span: [/private/tmp/nix-build-rustic-0.9.4.drv-0/rustic-0.9.4-vendor.tar.gz/hyper-util/src/client/legacy/connect/dns.rs:121] -- resolve;
[DEBUG] (14) hyper_util::client::legacy::connect::dns: resolving host="s3.us-west-2.amazonaws.com"
[DEBUG] (1) hyper_util::client::legacy::connect::http: connecting to 52.92.153.192:443
[TRACE] (1) mio::poll: [/private/tmp/nix-build-rustic-0.9.4.drv-0/rustic-0.9.4-vendor.tar.gz/mio/src/poll.rs:569] registering event source with poller: token=Token(140479080479232), interests=READABLE | WRITABLE
[DEBUG] (1) hyper_util::client::legacy::connect::http: connected to 52.92.153.192:443
[TRACE] (1) hyper_util::client::legacy::client: [/private/tmp/nix-build-rustic-0.9.4.drv-0/rustic-0.9.4-vendor.tar.gz/hyper-util/src/client/legacy/client.rs:592] http1 handshake complete, spawning background dispatcher task
[TRACE] (1) hyper_util::client::legacy::pool: [/private/tmp/nix-build-rustic-0.9.4.drv-0/rustic-0.9.4-vendor.tar.gz/hyper-util/src/client/legacy/pool.rs:713] checkout dropped for ("https", s3.us-west-2.amazonaws.com)
[TRACE] (4) mio::poll: [/private/tmp/nix-build-rustic-0.9.4.drv-0/rustic-0.9.4-vendor.tar.gz/mio/src/poll.rs:700] deregistering event source from poller
[ERROR] service=s3 name=pmw-backups-laptop-rustic-cold path=config: stat failed Unexpected (persistent) at stat

Context:
   uri: https://s3.us-west-2.amazonaws.com/pmw-backups-laptop-rustic-cold/config
   response: Parts { status: 400, version: HTTP/1.1, headers: {"x-amz-request-id": "FVWGD70YSD2FB1EF", "x-amz-id-2": "wnJtc3uryZlrkr99gmE1z1YUOEFvLz1uin6C2do8Em+tn61Q4+l/jX7y7nCPYO0728mpDCyFR5Q=", "content-type": "application/xml", "date": "Wed, 30 Oct 2024 23:55:04 GMT", "server": "AmazonS3", "connection": "close"} }
   service: s3
   path: config

[DEBUG] (1) opendal::services: service=s3 name=pmw-backups-laptop-rustic-cold path=config: stat failed
error: error listing the repo config file

My request: print the full response from S3 when the log level is trace or even debug.

This is what the AWS CLI does -- when verbose output is enabled, I see the full request and response payload.

Xuanwo commented 1 day ago

Hi, stat is a HEAD request, which doesn't have a body, so we don't have additional information to return here.

It's very weird that HEAD can return a 400 response since the request seems correct to me. Could you share more details about your setup? If you are using a static key, please double-check if the access key and secret key are configured correctly.

philipmw commented 1 day ago

Hello! I didn't realize that this is a HEAD request and that there's no more info to show me.

Yes, this problem has been perplexing me. I quadruple checked the access key, and it is the same bucket config that I use for restic that works. (Though I have not tried restic with this bucket.)

Here is my full rustic.toml:

[backup]
glob-files = ["/Users/pmw/Documents/laptop/backups/includes.dat"]

[repository]
repository = "opendal:s3"
repo-hot = "opendal:s3"
password-file = "/Users/pmw/Documents/laptop/backups/restic-repo-password.dat"

[repository.options]
access_key_id = "AKIA..."
secret_access_key = "..."

[repository.options-cold]
bucket = "pmw-backups-laptop-rustic-cold"
default_storage_class = "DEEP_ARCHIVE"

[repository.options-hot]
bucket = "pmw-backups-laptop-rustic-hot"
Xuanwo commented 1 day ago

Hi, @philipmw, thank you for providing more information on this. It's really helpful.

I now believe it's related to the DEEP_ARCHIVE storage class, which needs to be restored before it can be read. Calling stat (also known as HeadObject) on an archived file may return a 400 error. Please refer to https://docs.aws.amazon.com/AmazonS3/latest/API/API_RestoreObject.html for more details.

To verify my answer, please try using s3cmd to send a HeadObject or GetObject request to the same object.


cc @simonsan and @aawsome, perhaps we should consider handling config differently by not storing it as DEEP_ARCHIVE.

philipmw commented 1 day ago

Thanks. The command I am running is rustic init, and both buckets are newly created and empty. I don't know what archived objects rustic is trying to stat or what I can do to get around it.

Xuanwo commented 1 day ago

Thanks. The command I am running is rustic init, and both buckets are newly created and empty. I don't know what archived objects rustic is trying to stat or what I can do to get around it.

Hi, by setting default_storage_class = "DEEP_ARCHIVE", all files written in pmw-backups-laptop-rustic-cold will be in DEEP_ARCHIVE storage class.

philipmw commented 1 day ago

Is this not what I should be doing? I do want the -cold bucket to be in Deep Archive, and my understanding is that rustic supports this. An example of someone else configuring Deep Archive: https://kmh.prasil.info/posts/rustic-cold-storage-glacier-migration-configuration/

What's the right way to use Deep Archive with rustic?

aawsome commented 1 day ago

Actually rustic should write the config file into the cold storage location as DEEP_ARCHIVE, if it is specified like that. While rustic will not access any file/object in the cold storage, it requires to list contents. For config, it uses Operator::exists which seems to be calling stat internally. I wonder why this would return a 400 error.

@philipmw Did you ensure you have s3:GetObject and s3:ListBucket permission on your S3 bucket? If the error really is a 400 Bad Request error, the AWS docu says that only occurs when using encryption and wrong encryption request headers. Can you double check on this?

aawsome commented 1 day ago

Actually another test to rule out any permission stuff would be to run the same config without using default_storage_class and see if the error still exists...