fluent / fluent-bit

Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX and Windows
https://fluentbit.io
Apache License 2.0

Unable to STS assume Role in fluentbit/S3 plugin #8875

Closed sabdalla80 closed 1 week ago

sabdalla80 commented 3 months ago

I am suddenly unable to assume a role from my Fluent Bit DaemonSet on EKS. Fluent Bit is trying to assume a role in a different account so that it can write logs to a bucket there. The error started appearing recently, and I do not know what changed to cause it. I would appreciate any feedback.

Deployment settings:

    fluentBitChartName=fluent-bit
    fluentBitChartVersion=0.46.7
    fluentBitImageRepo=fluent/fluent-bit
    fluentBitImageTag=2.2.0

My output:

    [OUTPUT]
        Name                         s3
        Match                        internal.*
        region                       us-east-2
        bucket                       customer-logs-bucket-us-east-2
        external_id                  someId
        role_arn                     arn:aws:iam::account2:role/roletoassume
        sts_endpoint                 https://sts.us-east-2.amazonaws.com
        store_dir                    /tmp/fluent-bit/s3-kube
        retry_limit                  10
        total_file_size              15M
        upload_timeout               15s
        store_dir_limit_size         50M
        s3_key_format                $TAG-$UUID
        s3_key_format_tagdelimiters  .
        compression                  gzip
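One way to narrow this down is to attempt the same AssumeRole call outside Fluent Bit, from a pod that uses the same service account. The following is a minimal sketch, not a definitive diagnostic: it assumes boto3 is available in that pod, the helper names and the session name `fluent-bit-debug` are hypothetical, and the role ARN and external ID in the usage comment are the placeholders from the config above.

```python
def build_assume_role_params(role_arn, external_id, session_name="fluent-bit-debug"):
    """Build keyword arguments for STS AssumeRole.

    session_name is an arbitrary label chosen for this sketch.
    """
    return {
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
        "ExternalId": external_id,
    }


def try_assume_role(sts_client, role_arn, external_id):
    """Call AssumeRole and return (ok, detail).

    sts_client is any object exposing assume_role(**kwargs),
    for example a boto3 STS client.
    """
    params = build_assume_role_params(role_arn, external_id)
    try:
        response = sts_client.assume_role(**params)
    except Exception as exc:  # boto3 raises botocore.exceptions.ClientError on a 403
        return False, str(exc)
    return True, response["AssumedRoleUser"]["Arn"]


# Example usage (requires boto3 and AWS credentials in the environment):
#   import boto3
#   client = boto3.client("sts", region_name="us-east-2",
#                         endpoint_url="https://sts.us-east-2.amazonaws.com")
#   print(try_assume_role(client, "arn:aws:iam::account2:role/roletoassume", "someId"))
```

If this call also fails, the error text usually names the denied action or principal, which points at the role's trust policy or external ID rather than at Fluent Bit itself.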

The error from logs with debug on:

    [2024/05/26 19:18:24] [ info] [filter:kubernetes:kubernetes.3] connectivity OK
    [2024/05/26 19:18:24] [ info] [input:emitter:tag_for_s3] initializing
    [2024/05/26 19:18:24] [ info] [input:emitter:tag_for_s3] storage_strategy='memory' (memory only)
    [2024/05/26 19:18:24] [ info] [input:emitter:tag_for_kube] initializing
    [2024/05/26 19:18:24] [ info] [input:emitter:tag_for_kube] storage_strategy='memory' (memory only)
    [2024/05/26 19:18:24] [ info] [fstore] created root path /tmp/fluent-bit/s3-kube/customer-logs-bucket-us-east-2
    [2024/05/26 19:18:24] [ info] [output:s3:s3.0] Using upload size 15000000 bytes
    [2024/05/26 19:18:24] [ info] [aws_client] auth error, refreshing creds
    [2024/05/26 19:18:24] [error] [aws_credentials] Shared credentials file /root/.aws/credentials does not exist
    [2024/05/26 19:18:24] [ info] [output:s3:s3.0] worker #0 started
    [2024/05/26 19:18:24] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020

    [2024/05/26 19:19:42] [ info] [aws_client] auth error, refreshing creds
    [2024/05/26 19:19:42] [error] [aws_credentials] Shared credentials file /root/.aws/credentials does not exist
    [2024/05/26 19:19:42] [error] [aws_credentials] STS assume role request failed
    [2024/05/26 19:19:42] [ warn] [aws_credentials] No cached credentials are available and a credential refresh is already in progress. The current co-routine will retry.
    [2024/05/26 19:19:42] [error] [signv4] Provider returned no credentials, service=s3
    [2024/05/26 19:19:42] [error] [aws_client] could not sign request
    [2024/05/26 19:19:42] [error] [aws_credentials] STS assume role request failed
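When the log is this noisy, it can help to pull out only the warning and error entries along with the component that emitted them. The sketch below assumes each entry follows the `[timestamp] [level] [component] message` shape seen in the logs above; the function names and the regular expression are illustrative and are not part of Fluent Bit.

```python
import re

# Matches entries such as:
#   [2024/05/26 19:19:42] [error] [aws_credentials] STS assume role request failed
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+"            # timestamp
    r"\[\s*(?P<level>\w+)\]\s+"          # level, possibly left-padded ("[ info]")
    r"(?:\[(?P<component>[^\]]+)\]\s+)?"  # optional component tag
    r"(?P<msg>.*)$"                      # remaining message text
)


def parse_line(line):
    """Return a dict with ts, level, component, msg, or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def filter_levels(lines, levels=("error", "warn")):
    """Keep only parsed entries whose level is in `levels`, preserving order."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry["level"] in levels]
```

Passing the pod's log lines to `filter_levels` leaves the `aws_credentials`, `signv4`, and `aws_client` failures, which makes the order of the credential errors easier to follow.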

sabdalla80 commented 3 months ago

@PettitWesley I am seeing the same issue as this one (Fluent Bit 1.6 - ES Plugin: Failed to source credential on Amazon EKS IAM Roles for Service Account #2714). Could the bug have been re-introduced? I am able to send to S3, but not able to assume the role.

Here is another snippet of the debug output:

    [2024/05/26 20:19:48] [debug] [upstream] KA connection #77 to s3.us-east-2.amazonaws.com:443 has been assigned (recycled)
    [2024/05/26 20:19:48] [debug] [http_client] not using http_proxy for header
    [2024/05/26 20:19:48] [debug] [aws_credentials] Requesting credentials from the STS provider..
    [2024/05/26 20:19:48] [debug] [aws_credentials] STS Provider: Refreshing credential cache.
    [2024/05/26 20:19:48] [debug] [aws_credentials] Calling STS..
    [2024/05/26 20:19:48] [debug] [upstream] KA connection #343 to sts.us-east-2.amazonaws.com:443 has been assigned (recycled)
    [2024/05/26 20:19:48] [debug] [http_client] not using http_proxy for header
    [2024/05/26 20:19:48] [debug] [aws_credentials] Requesting credentials from the EKS provider..
    [2024/05/26 20:19:48] [debug] [task] destroy task=0x7fd750366f00 (task_id=0)
    [2024/05/26 20:19:48] [debug] [task] created task=0x7fd750366f00 id=0 without routes, dropping.
    [2024/05/26 20:19:48] [debug] [task] destroy task=0x7fd750366f00 (task_id=0)
    [2024/05/26 20:19:48] [debug] [task] created task=0x7fd750366f00 id=0 without routes, dropping.

    [2024/05/26 20:19:48] [debug] [task] destroy task=0x7fd750366f00 (task_id=0)
    [2024/05/26 20:19:48] [debug] [aws_client] sts.us-east-2.amazonaws.com: http_do=0, HTTP Status: 403
    [2024/05/26 20:19:48] [debug] [upstream] KA connection #343 to sts.us-east-2.amazonaws.com:443 is now available
    [2024/05/26 20:19:48] [debug] [aws_client] Unable to parse API response- response is not valid JSON.
    [2024/05/26 20:19:48] [debug] [aws_credentials] STS raw response:
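The "not valid JSON" message alongside the HTTP 403 is consistent with STS returning an XML error body, which is the usual format for its Query API errors. The raw body does not appear in the log above, so the sample below is purely illustrative and is not the response from this cluster. As a sketch, assuming the standard `ErrorResponse` layout, the error code and message could be extracted like this (`parse_sts_error` is a hypothetical helper name):

```python
import xml.etree.ElementTree as ET

# Illustrative sample only: NOT the response from this issue.
SAMPLE_BODY = """<ErrorResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">
  <Error>
    <Type>Sender</Type>
    <Code>AccessDenied</Code>
    <Message>illustrative message only</Message>
  </Error>
  <RequestId>00000000-0000-0000-0000-000000000000</RequestId>
</ErrorResponse>"""


def parse_sts_error(body):
    """Return (code, message) from an XML error body, or None if it cannot be parsed."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return None
    code = message = None
    for elem in root.iter():
        tag = elem.tag.rsplit("}", 1)[-1]  # drop the "{namespace}" prefix
        if tag == "Code":
            code = elem.text
        elif tag == "Message":
            message = elem.text
    if code is None and message is None:
        return None
    return code, message


print(parse_sts_error(SAMPLE_BODY))
```

The `Code` and `Message` fields of the real response would indicate whether the 403 comes from the trust policy, the external ID, or the calling identity.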

PettitWesley commented 3 months ago

Use the latest or the latest stable version.

github-actions[bot] commented 1 week ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions[bot] commented 1 week ago

This issue was closed because it has been stalled for 5 days with no activity.