cube-js / cube

cubestore S3 credentials regression in v0.35.17 #8163

Closed adamstruck closed 1 month ago

adamstruck commented 5 months ago

Describe the bug

When I attempted to upgrade from v0.35.14 I started to see this error:

2024-04-17T18:17:19.257Z INFO  [cubestored] <pid:1> Cube Store version 0.35.17
2024-04-17T18:17:19.260Z INFO  [cubestore::http::status] <pid:1> Serving status probes at 0.0.0.0:3031
thread 'main' panicked at /build/cubestore/cubestore/src/config/mod.rs:1816:86:
called `Result::unwrap()` on an `Err` value: CubeError { message: "Failed to create S3 credentials: attohttpc: Json Error: expected value at line 1 column 1", backtrace: "", cause: Internal }
stack backtrace:
   0: rust_begin_unwind
             at /rustc/190f4c96116a3b59b7de4881cfec544be0246d84/library/std/src/panicking.rs:647:5
   1: core::panicking::panic_fmt
             at /rustc/190f4c96116a3b59b7de4881cfec544be0246d84/library/core/src/panicking.rs:72:14
   2: core::result::unwrap_failed
             at /rustc/190f4c96116a3b59b7de4881cfec544be0246d84/library/core/src/result.rs:1649:5
   3: cubestore::config::injection::Injector::register::{{closure}}::{{closure}}::{{closure}}
   4: cubestore::config::injection::Injector::get_service::{{closure}}
   5: cubestore::config::injection::Injector::register::{{closure}}::{{closure}}::{{closure}}
   6: cubestore::config::injection::Injector::get_service::{{closure}}
   7: cubestore::config::injection::Injector::register_typed::{{closure}}::{{closure}}::{{closure}}
   8: cubestore::config::injection::Injector::get_service_typed::{{closure}}
   9: cubestore::config::injection::Injector::register_typed::{{closure}}::{{closure}}::{{closure}}
  10: cubestore::config::injection::Injector::get_service_typed::{{closure}}
  11: cubestore::config::injection::Injector::register_typed::{{closure}}::{{closure}}::{{closure}}
  12: cubestore::config::injection::Injector::get_service_typed::{{closure}}
  13: cubestored::main::{{closure}}
  14: tokio::runtime::park::CachedParkThread::block_on
  15: tokio::runtime::runtime::Runtime::block_on
  16: cubestored::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Version: v0.35.17

adamstruck commented 5 months ago

I assume this is due to one of these PRs:

https://github.com/cube-js/cube/pull/6019 https://github.com/cube-js/cube/pull/8158

ovr commented 5 months ago

Hello @adamstruck,

Did you specify both CUBESTORE_AWS_ACCESS_KEY_ID & CUBESTORE_AWS_SECRET_ACCESS_KEY or only CUBESTORE_AWS_SECRET_ACCESS_KEY?

Thanks

adamstruck commented 5 months ago

I only set CUBESTORE_AWS_CREDS_REFRESH_EVERY_MINS. The credentials used to be picked up automatically from the pod / instance. We have been running various versions of cube for the past year and this is the first time it has been a problem.

I am able to run v0.35.16 without issue.

ovr commented 5 months ago

Is it correct that you are using STS (AWS_WEB_IDENTITY_TOKEN_FILE) with IAM?

"Failed to create S3 credentials: attohttpc: Json Error: expected value at line 1 column 1"

This error comes from the fallback logic, which tries to resolve credentials from the instance metadata endpoint at the magic IP 169.254.169.254. That path is only reached when the other credential sources did not produce anything.
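A minimal sketch of the kind of fallback chain described here, to make it easier to see which branch a given pod would end up in. The function name, the returned labels, and the exact ordering are assumptions for illustration and are not taken from the rust-s3 or Cube Store source; AWS_ROLE_ARN is the standard companion variable to AWS_WEB_IDENTITY_TOKEN_FILE.

```python
from typing import Mapping


def pick_credential_source(env: Mapping[str, str]) -> str:
    """Return a label for the first credential source that looks usable.

    Hypothetical ordering: static keys, then web identity (STS),
    then the instance metadata endpoint as the last resort.
    """
    # Static keys only count when both halves are present.
    if env.get("CUBESTORE_AWS_ACCESS_KEY_ID") and env.get("CUBESTORE_AWS_SECRET_ACCESS_KEY"):
        return "static_keys"
    # Web identity needs both a role to assume and a token file.
    if env.get("AWS_ROLE_ARN") and env.get("AWS_WEB_IDENTITY_TOKEN_FILE"):
        return "web_identity_sts"
    # Nothing else matched: fall back to http://169.254.169.254.
    return "instance_metadata"
```

Calling `pick_credential_source(os.environ)` inside the pod gives a quick hint about whether the metadata fallback is the branch that would be expected to run.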

adamstruck commented 5 months ago

Yes, s3 access is being managed using IAM roles for the nodes in the EKS cluster.

adamstruck commented 5 months ago

@ovr do you think this is something you will be able to fix soon?

ovr commented 5 months ago

@adamstruck I tried to find out what had changed in the library, but I could not find anything that can affect you.

adamstruck commented 5 months ago

Did cube start doing anything differently starting in that version?

adamstruck commented 5 months ago

Or maybe it is this line:

https://github.com/cube-js/cube/pull/6019/files#diff-28d0e549290be2ac2e69b3c1da8d7e05aa0f58db287f72872c53c93c496913a0R57

I am not familiar with Rust dependency management, but this seems to imply that not all features are included after the bump?

adamstruck commented 5 months ago

I still am running into this issue with the latest version: v0.35.22

ovr commented 5 months ago

We didn't start doing anything differently. I reviewed all changes in the rust-s3 crate and found nothing that could cause this error.

In production, we use IAM, and it works correctly.

You can check it from the pod with curl against http://169.254.169.254/latest/meta-data/iam/security-credentials. Next, you can pass the role name to http://169.254.169.254/latest/meta-data/iam/security-credentials/{YOUR_ROLE}

Do you see any error? The second request should return valid JSON.

Thanks
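The two-step check above can be sketched in Python for anyone who prefers a script over curl. This is only a sketch under assumptions: the helper names are hypothetical, it uses plain unauthenticated GETs (IMDSv1 style, as in the curl commands), and the real HTTP call only works from a pod or instance that can reach the metadata endpoint. The `fetch` parameter exists so the logic can be exercised without network access.

```python
import json
import urllib.request
from typing import Callable

IMDS_BASE = "http://169.254.169.254/latest/meta-data/iam/security-credentials"


def http_get(url: str, timeout: float = 2.0) -> str:
    """Plain GET returning the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def fetch_instance_credentials(fetch: Callable[[str], str] = http_get) -> dict:
    """Resolve the role name first, then fetch that role's credentials."""
    # Step 1: the listing endpoint returns the role name as plain text.
    listing = fetch(IMDS_BASE).strip()
    if not listing:
        raise RuntimeError("metadata endpoint returned no role name")
    role = listing.splitlines()[0]
    # Step 2: the per-role endpoint is expected to return a JSON document.
    body = fetch(f"{IMDS_BASE}/{role}")
    return json.loads(body)
```

If step 2 raises a JSON decoding error, the body was not JSON at all, which is the same class of failure as the panic reported in this issue.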

adamstruck commented 5 months ago

I think you are on to something...

$ curl http://169.254.169.254/latest/meta-data/iam/security-credentials/
empty role

AWS_ROLE_ARN is set in my env and so is AWS_WEB_IDENTITY_TOKEN_FILE.

Any ideas?
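A plain-text body such as `empty role` is consistent with the "expected value at line 1 column 1" message at the top of this issue, since a JSON parser fails on the very first character. That link is an inference, not something confirmed in the thread. A small sketch for classifying whatever the metadata endpoint returns (the function name and the labels are hypothetical):

```python
import json


def diagnose_metadata_body(body: str) -> str:
    """Classify a response body from the security-credentials endpoint."""
    if not body.strip():
        # Nothing came back at all.
        return "empty body"
    try:
        doc = json.loads(body)
    except json.JSONDecodeError:
        # Plain text such as "empty role" ends up here.
        return "not JSON"
    if not isinstance(doc, dict) or "AccessKeyId" not in doc:
        # Valid JSON, but not a credentials document.
        return "JSON without credentials"
    return "ok"
```

Feeding it the curl output from the pod shows quickly whether the problem is the endpoint's response rather than the S3 permissions.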

adamstruck commented 5 months ago

Based on https://github.com/durch/rust-s3/blob/v0.32.0/aws-creds/src/credentials.rs#L173 it looks like this should be supported.

This seems to be caused by the bug described in https://github.com/durch/rust-s3/issues/286. Could you bump rust-s3 to 0.33.0, which includes the fix?

ChrisLahaye commented 5 months ago

> I only set CUBESTORE_AWS_CREDS_REFRESH_EVERY_MINS. The credentials used to be picked up automatically from the pod / instance. We have been running various versions of cube for the past year and this is the first time it has been a problem.
>
> I am able to run v0.35.16 without issue.

We are experiencing the same issue, although we get the following error:

2024-04-25T14:43:31.783Z ERROR [cubestore::cluster] <pid:1> Error: CubeError { message: "AWS S3 error: serde xml: custom: missing field `Name`", backtrace: "", cause: Internal }

ovr commented 5 months ago

Right now, it's not possible to use the 0.33 release because it has bugs. At the same time, 0.34-rc has a problem with large file uploads because it doesn't limit the number of parallel PUTs, which causes high memory usage.

🫠

At the same time, the official SDK from AWS has problems.

@adamstruck

So, I backported the fix from https://github.com/durch/rust-s3/issues/286 in https://github.com/cube-js/cube/pull/8195/files

ovr commented 5 months ago

@adamstruck Could you give the latest release a try? https://github.com/cube-js/cube/releases/tag/v0.35.24

Ty

goncaloacteixeira commented 5 months ago

@ovr the issue still persists on my side, tested v0.35.24

2024-04-26T13:38:53.370Z INFO  [cubestored] <pid:1> Cube Store version 0.35.24
2024-04-26T13:38:53.376Z INFO  [cubestore::http::status] <pid:1> Serving status probes at 0.0.0.0:3031
thread 'main' panicked at /build/cubestore/cubestore/src/config/mod.rs:1816:86:
called `Result::unwrap()` on an `Err` value: CubeError { message: "Failed to create S3 credentials: attohttpc: Json Error: EOF while parsing a value at line 1 column 0", backtrace: "", cause: Internal }
stack backtrace:
   0: rust_begin_unwind
             at /rustc/190f4c96116a3b59b7de4881cfec544be0246d84/library/std/src/panicking.rs:647:5
   1: core::panicking::panic_fmt
             at /rustc/190f4c96116a3b59b7de4881cfec544be0246d84/library/core/src/panicking.rs:72:14
   2: core::result::unwrap_failed
             at /rustc/190f4c96116a3b59b7de4881cfec544be0246d84/library/core/src/result.rs:1649:5
   3: cubestore::config::injection::Injector::register::{{closure}}::{{closure}}::{{closure}}
   4: cubestore::config::injection::Injector::get_service::{{closure}}
   5: cubestore::config::injection::Injector::register::{{closure}}::{{closure}}::{{closure}}
   6: cubestore::config::injection::Injector::get_service::{{closure}}
   7: cubestore::config::injection::Injector::register_typed::{{closure}}::{{closure}}::{{closure}}
   8: cubestore::config::injection::Injector::get_service_typed::{{closure}}
   9: cubestore::config::injection::Injector::register_typed::{{closure}}::{{closure}}::{{closure}}
  10: cubestore::config::injection::Injector::get_service_typed::{{closure}}
  11: cubestore::config::injection::Injector::register_typed::{{closure}}::{{closure}}::{{closure}}
  12: cubestore::config::injection::Injector::get_service_typed::{{closure}}
  13: cubestored::main::{{closure}}
  14: tokio::runtime::park::CachedParkThread::block_on
  15: tokio::runtime::runtime::Runtime::block_on
  16: cubestored::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

ChrisLahaye commented 5 months ago

@ovr the issue persists here as well. Although it is slightly different, it was introduced in the same release.

2024-04-26T13:38:47.733Z INFO  [cubestored] <pid:1> Cube Store version 0.35.24
2024-04-26T13:38:47.740Z INFO  [cubestore::http::status] <pid:1> Serving status probes at 0.0.0.0:3031
thread 'main' panicked at /build/cubestore/cubestore/src/config/mod.rs:1971:34:
called `Result::unwrap()` on an `Err` value: CubeError { message: "AWS S3 error: serde xml: custom: missing field `Name`", backtrace: "", cause: Internal }
stack backtrace:
   0: rust_begin_unwind
             at /rustc/190f4c96116a3b59b7de4881cfec544be0246d84/library/std/src/panicking.rs:647:5
   1: core::panicking::panic_fmt
             at /rustc/190f4c96116a3b59b7de4881cfec544be0246d84/library/core/src/panicking.rs:72:14
   2: core::result::unwrap_failed
             at /rustc/190f4c96116a3b59b7de4881cfec544be0246d84/library/core/src/result.rs:1649:5
   3: cubestore::config::injection::Injector::register_typed::{{closure}}::{{closure}}::{{closure}}
   4: cubestore::config::injection::Injector::get_service_typed::{{closure}}
   5: cubestore::config::injection::Injector::register_typed::{{closure}}::{{closure}}::{{closure}}
   6: cubestore::config::injection::Injector::get_service_typed::{{closure}}
   7: cubestore::config::injection::Injector::register_typed::{{closure}}::{{closure}}::{{closure}}
   8: cubestore::config::injection::Injector::get_service_typed::{{closure}}
   9: cubestored::main::{{closure}}
  10: tokio::runtime::park::CachedParkThread::block_on
  11: tokio::runtime::runtime::Runtime::block_on
  12: cubestored::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

adamstruck commented 5 months ago

@ovr I am still seeing errors with v0.35.24

2024-04-26T16:12:16.689Z INFO  [cubestored] <pid:1> Cube Store version 0.35.24
thread 'main' panicked at /build/cubestore/cubestore/src/config/mod.rs:1816:86:
called `Result::unwrap()` on an `Err` value: CubeError { message: "Failed to create S3 credentials: attohttpc: Json Error: expected value at line 1 column 1", backtrace: "", cause: Internal }
stack backtrace:
   0: rust_begin_unwind
             at /rustc/190f4c96116a3b59b7de4881cfec544be0246d84/library/std/src/panicking.rs:647:5
   1: core::panicking::panic_fmt
             at /rustc/190f4c96116a3b59b7de4881cfec544be0246d84/library/core/src/panicking.rs:72:14
   2: core::result::unwrap_failed
             at /rustc/190f4c96116a3b59b7de4881cfec544be0246d84/library/core/src/result.rs:1649:5
   3: cubestore::config::injection::Injector::register::{{closure}}::{{closure}}::{{closure}}
   4: cubestore::config::injection::Injector::get_service::{{closure}}
   5: cubestore::config::injection::Injector::register::{{closure}}::{{closure}}::{{closure}}
   6: cubestore::config::injection::Injector::get_service::{{closure}}
   7: cubestore::config::injection::Injector::register_typed::{{closure}}::{{closure}}::{{closure}}
   8: cubestore::config::injection::Injector::get_service_typed::{{closure}}
   9: cubestore::config::injection::Injector::register_typed::{{closure}}::{{closure}}::{{closure}}
  10: cubestore::config::injection::Injector::get_service_typed::{{closure}}
  11: cubestore::config::injection::Injector::register_typed::{{closure}}::{{closure}}::{{closure}}
  12: cubestore::config::injection::Injector::get_service_typed::{{closure}}
  13: cubestored::main::{{closure}}
  14: tokio::runtime::park::CachedParkThread::block_on
  15: tokio::runtime::runtime::Runtime::block_on
  16: cubestored::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

adamstruck commented 5 months ago

@ovr could the from_sts method just be reverted to what was in 0.26.3 (https://github.com/durch/rust-s3/blob/0.26.3/aws-creds/src/credentials.rs#L139-L141)?

adamstruck commented 4 months ago

I hacked around this issue by authenticating via STS in a separate process and storing the returned credentials in ~/.aws/credentials so that cubestore is able to use them.
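The second half of this workaround, writing the temporary credentials where the shared credentials file is read from, can be sketched as follows. Obtaining the credentials is left out (for example via `aws sts assume-role-with-web-identity`), and the helper names and the `default` profile are assumptions, not details from this comment.

```python
import configparser
import io
import os


def render_credentials_file(access_key_id: str, secret_access_key: str,
                            session_token: str, profile: str = "default") -> str:
    """Build the INI text for an AWS shared credentials file."""
    # Interpolation is disabled so '%' in a value is written verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser[profile] = {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
        "aws_session_token": session_token,
    }
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


def write_credentials_file(path: str, text: str) -> None:
    """Write the file readable by the owner only, replacing any old content."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write(text)
```

Because STS credentials expire, the separate process would need to rewrite the file periodically, before the current session token runs out.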

paveltiunov commented 4 months ago

@adamstruck If it's still an issue, please feel free to provide PRs to https://github.com/cube-js/rust-s3 and to the Cube Store itself.

github-actions[bot] commented 4 months ago

If you are interested in working on this issue, please go ahead and provide PR for that. We'd be happy to review it and merge it. If this is the first time you are contributing a Pull Request to Cube, please check our contribution guidelines. You can also post any questions while contributing in the #contributors channel in the Cube Slack.

ovr commented 4 months ago

@adamstruck Could you give v0.35.34 a try?

Thanks

adamstruck commented 4 months ago

@ovr it looks like the issue is fixed in v0.35.34. Thank you for finding a fix for this!

ovr commented 4 months ago

@adamstruck Thank you for testing. I am going to close this issue.

danielli-ziprecruiter commented 4 months ago

I'm still experiencing this issue in v0.35.38 with the same error message in this comment.

lukethomas-acsg commented 3 months ago

Hi everyone, I've just been doing a bit of diagnosis on this. I think what might be happening here is that AWS STS credentials aren't actually enabled in release builds of the rust-s3 library.

Looking at the following line, it doesn't look like the "http-credentials" feature is enabled: https://github.com/cube-js/rust-s3/blob/cubestore-0.32.3-backport/rust-s3/Cargo.toml#L21

However, it actually is enabled in the dev-dependencies section of the same file (https://github.com/cube-js/rust-s3/blob/cubestore-0.32.3-backport/rust-s3/Cargo.toml#L75).

I suspect that the error message in this comment actually isn't related to getting credentials at all but is produced because the instance that the container is running on has IMDSv1 enabled. This allows the rust-s3 library to retrieve credentials from the underlying instance. The role associated with these credentials likely doesn't have the correct S3 permissions and as a result, we get this error message.

This might be as simple to fix as enabling the "http-credentials" feature on the "aws-creds" dependency in the "rust-s3" crate but I don't have a good enough knowledge of the project as a whole to know whether this is all that would need to be done.

Graphmaxer commented 1 month ago

Hello, for me it has been working since v0.35.34, including on v0.35.38, but since v0.35.67 I get a new error related to certificates:

2024-08-09T15:06:39.214Z INFO  [cubestored] <pid:1> Cube Store version 0.35.67
2024-08-09T15:06:39.215Z DEBUG [cubestored] <pid:1> New process started
2024-08-09T15:06:39.215Z TRACE [cubestore::telemetry] <pid:1> agent endpoint url: None
2024-08-09T15:06:39.219Z INFO  [cubestore::http::status] <pid:1> Serving status probes at 0.0.0.0:3031
2024-08-09T15:06:39.436Z DEBUG [cubestore::remotefs::s3] <pid:1> Started S3 credentials refresh loop
thread 'main' panicked at /build/cubestore/cubestore/src/config/mod.rs:1982:34:
called `Result::unwrap()` on an `Err` value: CubeError { message: "AWS S3 error: reqwest: error sending request for url (https://***.s3-***.amazonaws.com/?prefix=metastore-current&list-type=2): error trying to connect: error:16000069:STORE routines:ossl_store_get0_loader_int:unregistered scheme:../crypto/store/store_register.c:237:scheme=file, error:80000002:system library:file_open:reason(2):../providers/implementations/storemgmt/file_store.c:267:calling stat(/usr/lib/ssl/certs), error:16000069:STORE routines:ossl_store_get0_loader_int:unregistered scheme:../crypto/store/store_register.c:237:scheme=file, error:80000002:system library:file_open:reason(2):../providers/implementations/storemgmt/file_store.c:267:calling stat(/usr/lib/ssl/certs), error:16000069:STORE routines:ossl_store_get0_loader_int:unregistered scheme:../crypto/store/store_register.c:237:scheme=file, error:80000002:system library:file_open:reason(2):../providers/implementations/storemgmt/file_store.c:267:calling stat(/usr/lib/ssl/certs), error:16000069:STORE routines:ossl_store_get0_loader_int:unregistered scheme:../crypto/store/store_register.c:237:scheme=file, error:80000002:system library:file_open:reason(2):../providers/implementations/storemgmt/file_store.c:267:calling stat(/usr/lib/ssl/certs), error:16000069:STORE routines:ossl_store_get0_loader_int:unregistered scheme:../crypto/store/store_register.c:237:scheme=file, error:80000002:system library:file_open:reason(2):../providers/implementations/storemgmt/file_store.c:267:calling stat(/usr/lib/ssl/certs), error:0A000086:SSL routines:tls_post_process_server_certificate:certificate verify failed:../ssl/statem/statem_clnt.c:1889: (unable to get local issuer certificate)", backtrace: "", cause: Internal }
stack backtrace:

Maybe it's related to this small change: https://github.com/cube-js/cube/pull/8554/files. It works on v0.35.66. Thanks for the help.
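The repeated `calling stat(/usr/lib/ssl/certs)` entries with `reason(2)` in the error above appear to indicate that the CA directory could not be found inside the container. A minimal sketch for checking which candidate CA locations are missing; the helper name is hypothetical, and `/etc/ssl/certs` is an assumed second location that does not appear in the log.

```python
import os
from typing import Callable, Iterable, List


def missing_ca_paths(candidates: Iterable[str],
                     exists: Callable[[str], bool] = os.path.exists) -> List[str]:
    """Return the candidate CA paths that do not exist, in input order."""
    return [path for path in candidates if not exists(path)]


# The first path comes from the error message; the second is an assumption.
CANDIDATE_CA_DIRS = ["/usr/lib/ssl/certs", "/etc/ssl/certs"]
```

Running `missing_ca_paths(CANDIDATE_CA_DIRS)` inside the Cube Store container lists the directories that are absent from the image.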

Graphmaxer commented 1 month ago

Fixed in v0.35.68, thanks :) https://github.com/cube-js/cube/pull/8571