open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

AWS S3 WebIdentityErr: failed to retrieve credentials by using IAM Role #34922

Open susikanth opened 2 weeks ago

susikanth commented 2 weeks ago

Describe Issue: I am encountering the following error while attempting to push logs to S3 buckets using an IAM role with the awss3 exporter in the OpenTelemetry Collector chart.

Chart: opentelemetry-collector
Version: 0.101.2
Image: otel/opentelemetry-collector-contrib:0.106.1

Error message:

    error exporterhelper/common.go:296 Exporting failed. Rejecting data. {"kind": "exporter", "data_type": "logs", "name": "awss3", "error": "WebIdentityErr: failed to retrieve credentials\n
    caused by: SerializationError: failed to unmarshal error message\n
    \tstatus code: 405, request id: \n
    caused by: UnmarshalError: failed to unmarshal error message\n
    \t00000000 3c 3f 78 6d 6c 20 76 65 72 73 69 6f 6e 3d 22 31 |<?xml version=\"1|\n
    00000010 2e 30 22 20 65 6e 63 6f 64 69 6e 67 3d 22 55 54 |.0\" encoding=\"UT|\n
    00000020 46 2d 38 22 3f 3e 0a 3c 45 72 72 6f 72 3e 3c 43 |F-8\"?>.<Error><C|\n
    00000030 6f 64 65 3e 4d 65 74 68 6f 64 4e 6f 74 41 6c 6c |ode>MethodNotAll|\n
    00000040 6f 77 65 64 3c 2f 43 6f 64 65 3e 3c 4d 65 73 73 |owed</Code><Mess|\n
    00000050 61 67 65 3e 54 68 65 20 73 70 65 63 69 66 69 65 |age>The specifie|\n
    00000060 64 20 6d 65 74 68 6f 64 20 69 73 20 6e 6f 74 20 |d method is not |\n
    00000070 61 6c 6c 6f 77 65 64 20 61 67 61 69 6e 73 74 20 |allowed against |\n
    00000080 74 68 69 73 20 72 65 73 6f 75 72 63 65 2e 3c 2f |this resource.</|\n
    00000090 4d 65 73 73 61 67 65 3e 3c 4d 65 74 68 6f 64 3e |Message><Method>|\n
    000000a0 50 4f 53 54 3c 2f 4d 65 74 68 6f 64 3e 3c 52 65 |POST</Method><Re|\n
    000000b0 73 6f 75 72 63 65 54 79 70 65 3e 53 45 52 56 49 |sourceType>SERVI|\n
    000000c0 43 45 3c 2f 52 65 73 6f 75 72 63 65 54 79 70 65 |CE</ResourceType|\n
    000000d0 3e 3c 52 65 71 75 65 73 74 49 64 3e 5a 59 51 58 |><RequestId>ZYQX|\n
    000000e0 42 31 50 57 38 38 39 37 53 39 50 59 3c 2f 52 65 |B1PW8897S9PY</Re|\n
    000000f0 71 75 65 73 74 49 64 3e 3c 48 6f 73 74 49 64 3e |questId><HostId>|\n
    00000100 69 5a 57 63 2f 35 32 75 56 53 44 6e 37 39 33 73 |iZWc/52uVSDn793s|\n
    00000110 45 41 2f 45 46 61 30 79 55 39 31 36 7a 57 59 59 |EA/EFa0yU916zWYY|\n
    00000120 69 65 64 42 32 30 30 48 4b 41 31 6d 33 6d 41 4d |iedB200HKA1m3mAM|\n
    00000130 51 4c 45 37 6b 2b 47 77 54 48 7a 6d 78 6e 41 41 |QLE7k+GwTHzmxnAA|\n
    00000140 59 33 74 62 45 48 51 31 48 49 6f 3d 3c 2f 48 6f |Y3tbEHQ1HIo=</Ho|\n
    00000150 73 74 49 64 3e 3c 2f 45 72 72 6f 72 3e |stId></Error>|\n\n
    caused by: unknown error response tag, {{ Error} []}", "rejected_items": 1}
    go.opentelemetry.io/collector/exporter/exporterhelper.(*baseExporter).send
        go.opentelemetry.io/collector/exporter@v0.107.0/exporterhelper/common.go:296
    go.opentelemetry.io/collector/exporter/exporterhelper.NewLogsRequestExporter.func1
        go.opentelemetry.io/collector/exporter@v0.107.0/exporterhelper/logs.go:134
    go.opentelemetry.io/collector/consumer.ConsumeLogsFunc.ConsumeLogs
        go.opentelemetry.io/collector/consumer@v0.107.0/logs.go:26
    go.opentelemetry.io/collector/internal/fanoutconsumer.(*logsConsumer).ConsumeLogs
        go.opentelemetry.io/collector@v0.107.0/internal/fanoutconsumer/logs.go:73
    go.opentelemetry.io/collector/processor/processorhelper.NewLogsProcessor.func1
        go.opentelemetry.io/collector/processor@v0.107.0/processorhelper/logs.go:56
    go.opentelemetry.io/collector/consumer.ConsumeLogsFunc.ConsumeLogs
        go.opentelemetry.io/collector/consumer@v0.107.0/logs.go:26
    go.opentelemetry.io/collector/consumer.ConsumeLogsFunc.ConsumeLogs
        go.opentelemetry.io/collector/consumer@v0.107.0/logs.go:26
    go.opentelemetry.io/collector/internal/fanoutconsumer.(*logsConsumer).ConsumeLogs
        go.opentelemetry.io/collector@v0.107.0/internal/fanoutconsumer/logs.go:64
    github.com/open-telemetry/opentelemetry-collector-contrib/internal/coreinternal/consumerretry.(*logsConsumer).ConsumeLogs
        github.com/open-telemetry/opentelemetry-collector-contrib/internal/coreinternal@v0.107.0/consumerretry/logs.go:66
    github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/adapter.(*receiver).consumerLoop
        github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.107.0/adapter/receiver.go:126
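The hex dump in the error is the raw response body that the SDK could not unmarshal, so decoding it shows what the server actually answered. The following is a minimal sketch of that decoding in Python; the function names `decode_hexdump` and `parse_s3_error` are illustrative and not part of the Collector or the AWS SDK, and the hex rows are copied from the log above.

```python
import re
import xml.etree.ElementTree as ET

# Hex column of the dump from the log above (offset followed by byte values).
HEXDUMP = """\
00000000 3c 3f 78 6d 6c 20 76 65 72 73 69 6f 6e 3d 22 31
00000010 2e 30 22 20 65 6e 63 6f 64 69 6e 67 3d 22 55 54
00000020 46 2d 38 22 3f 3e 0a 3c 45 72 72 6f 72 3e 3c 43
00000030 6f 64 65 3e 4d 65 74 68 6f 64 4e 6f 74 41 6c 6c
00000040 6f 77 65 64 3c 2f 43 6f 64 65 3e 3c 4d 65 73 73
00000050 61 67 65 3e 54 68 65 20 73 70 65 63 69 66 69 65
00000060 64 20 6d 65 74 68 6f 64 20 69 73 20 6e 6f 74 20
00000070 61 6c 6c 6f 77 65 64 20 61 67 61 69 6e 73 74 20
00000080 74 68 69 73 20 72 65 73 6f 75 72 63 65 2e 3c 2f
00000090 4d 65 73 73 61 67 65 3e 3c 4d 65 74 68 6f 64 3e
000000a0 50 4f 53 54 3c 2f 4d 65 74 68 6f 64 3e 3c 52 65
000000b0 73 6f 75 72 63 65 54 79 70 65 3e 53 45 52 56 49
000000c0 43 45 3c 2f 52 65 73 6f 75 72 63 65 54 79 70 65
000000d0 3e 3c 52 65 71 75 65 73 74 49 64 3e 5a 59 51 58
000000e0 42 31 50 57 38 38 39 37 53 39 50 59 3c 2f 52 65
000000f0 71 75 65 73 74 49 64 3e 3c 48 6f 73 74 49 64 3e
00000100 69 5a 57 63 2f 35 32 75 56 53 44 6e 37 39 33 73
00000110 45 41 2f 45 46 61 30 79 55 39 31 36 7a 57 59 59
00000120 69 65 64 42 32 30 30 48 4b 41 31 6d 33 6d 41 4d
00000130 51 4c 45 37 6b 2b 47 77 54 48 7a 6d 78 6e 41 41
00000140 59 33 74 62 45 48 51 31 48 49 6f 3d 3c 2f 48 6f
00000150 73 74 49 64 3e 3c 2f 45 72 72 6f 72 3e
"""


def decode_hexdump(dump: str) -> bytes:
    """Rebuild raw bytes from rows shaped like 'OFFSET hh hh ... |ascii|'."""
    out = bytearray()
    for row in dump.splitlines():
        # Drop any ASCII column so its text is never mistaken for hex bytes.
        fields = row.split("|", 1)[0].split()
        # fields[0] is the offset; the remaining fields are two-digit hex bytes.
        for field in fields[1:]:
            if re.fullmatch(r"[0-9a-fA-F]{2}", field):
                out.append(int(field, 16))
    return bytes(out)


def parse_s3_error(dump: str) -> dict:
    """Return the child elements of the <Error> document as a tag -> text dict."""
    root = ET.fromstring(decode_hexdump(dump))
    return {child.tag: child.text for child in root}


if __name__ == "__main__":
    for tag, text in parse_s3_error(HEXDUMP).items():
        print(f"{tag}: {text}")
```

Running it lists the `Code`, `Message`, `Method`, `ResourceType`, `RequestId` and `HostId` fields of the response, which is easier to read than the escaped dump inside the log line.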

More details are in the error output above. Configuration manifest:

mode: daemonset
image:
  repository: "m2pfintech01/opentelemetry"
  tag: "opentelemetry-collector-contrib-v0.106.1"
imagePullSecrets:

presets:
  logsCollection:
    enabled: true
    includeCollectorLogs: false

  kubernetesAttributes:
    enabled: true
    extractAllPodLabels: true

config:
  exporters:
    otlp:
      endpoint: a77825710dac34c8da25e7654-0ff98df6a618fa74.elb.ap-south-1.amazonaws.com:4317
      tls:
        insecure: true
      headers:
        "Authorization": "Basic dXNlcm5hbWUxOnBhc3N3b3JkMQ=="
    awss3:
      s3uploader:
        region: "ap-south-1"
        s3_bucket: "bucket"
        s3_prefix: "agentsss"
        s3_partition: "minute"
        disable_ssl: true
        endpoint: "s3.ap-south-1.amazonaws.com"
        role_arn: "arn:aws:iam::arn"
        s3_force_path_style: true
  receivers:
    filelog:
      include:

The AWS IAM role has sufficient privileges to push the logs. I have tested this with another service using the same role ARN, and it worked fine. Hence, it seems to be an issue at the OpenTelemetry Collector level. Could someone assist me with resolving this issue?

dosubot[bot] commented 2 weeks ago

Hey @susikanth! I'm here to help with any bugs, questions, or contributions you have. Let's squash those bugs together!

The error message indicates that the specified method is not allowed against the S3 resource. This typically happens when the IAM role does not have the correct permissions or when the S3 bucket policy does not allow the required actions.

Here are a few steps to troubleshoot and resolve the issue:

  1. Check IAM Role Permissions: Ensure that the IAM role has the necessary permissions to perform the required S3 actions. The role should have at least the following permissions:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "s3:PutObject",
            "s3:PutObjectAcl",
            "s3:GetObject",
            "s3:ListBucket"
          ],
          "Resource": [
            "arn:aws:s3:::bucket",
            "arn:aws:s3:::bucket/*"
          ]
        }
      ]
    }
  2. Verify Bucket Policy: Ensure that the S3 bucket policy allows the necessary actions from the IAM role. The bucket policy should look something like this:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::account-id:role/role-name"
          },
          "Action": [
            "s3:PutObject",
            "s3:PutObjectAcl",
            "s3:GetObject",
            "s3:ListBucket"
          ],
          "Resource": [
            "arn:aws:s3:::bucket",
            "arn:aws:s3:::bucket/*"
          ]
        }
      ]
    }
  3. Check Web Identity Token: Ensure that the web identity token is correctly configured and valid. The error might be due to an issue with the token used to assume the IAM role.

  4. Endpoint Configuration: Verify that the endpoint configuration is correct. The endpoint should match the region and service you are trying to access.

  5. Disable SSL: Since you have disable_ssl: true in your configuration, ensure that this is intentional and that your endpoint supports non-SSL connections.

  6. Role ARN Configuration: Ensure that the role_arn specified in the configuration is correct and that the role can be assumed by the service account running the OpenTelemetry Collector.

If all the above configurations are correct and the issue persists, you might want to enable more detailed logging in the OpenTelemetry Collector to get more insights into the failure.
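As a rough aid for steps 1 and 2, the sketch below checks whether a policy document contains an `Allow` statement covering a given action and resource. It is a simplified illustration and not how IAM evaluates requests: it ignores `Deny` statements, `Condition` blocks, `NotAction`, and `Principal`, and it uses Python's `fnmatch` for wildcards, which is only an approximation of IAM pattern matching. The helper name `allows` is hypothetical.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM accepts either a single string or a list in Action/Resource."""
    return value if isinstance(value, list) else [value]


def allows(policy: dict, action: str, resource: str) -> bool:
    """Return True if some Allow statement matches the action and resource.

    Simplified: no Deny, Condition, NotAction or Principal handling.
    """
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        # Action names are compared case-insensitively; ARNs are not.
        action_ok = any(fnmatchcase(action.lower(), a.lower()) for a in actions)
        resource_ok = any(fnmatchcase(resource, r) for r in resources)
        if action_ok and resource_ok:
            return True
    return False


# The role policy from step 1, with the placeholder bucket name "bucket".
ROLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:PutObjectAcl", "s3:GetObject", "s3:ListBucket"],
            "Resource": ["arn:aws:s3:::bucket", "arn:aws:s3:::bucket/*"],
        }
    ],
}

if __name__ == "__main__":
    print(allows(ROLE_POLICY, "s3:PutObject", "arn:aws:s3:::bucket/agentsss/log.json"))
    print(allows(ROLE_POLICY, "s3:DeleteObject", "arn:aws:s3:::bucket/agentsss/log.json"))
```

A check like this only catches obvious gaps in a policy document; the authoritative answer still comes from AWS itself.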

To continue talking to Dosu, mention @dosu.

susikanth commented 2 weeks ago

I have granted S3 full access to this role. However, I am encountering the same issue:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "s3:*",
            "s3-object-lambda:*"
          ],
          "Resource": "*"
        }
      ]
    }

dosubot[bot] commented 2 weeks ago

Could you provide more details about the web identity token configuration? Specifically, how is the token being generated and passed to the OpenTelemetry Collector? Additionally, can you confirm if the endpoint configuration is correct and matches the region and service you are trying to access?


susikanth commented 2 weeks ago

Web Identity Token Generation:

The web identity token is issued by Kubernetes for a service account that is linked to an AWS IAM role, typically when running on EKS (Elastic Kubernetes Service). The link is made by annotating the service account with the role ARN, for example:

serviceAccount:
  create: true
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/your-role-name

Kubernetes mounts a token in the service account's pod, and this token is used by the OpenTelemetry Collector to assume the IAM role and authenticate with AWS services (like S3).

Passing the Web Identity Token to OpenTelemetry Collector:

The OpenTelemetry Collector automatically retrieves the web identity token from the pod's environment if properly configured with the IAM role. Ensure that the AWS SDK inside the otel/opentelemetry-collector-contrib image can access the token at the expected location (/var/run/secrets/eks.amazonaws.com/serviceaccount/token). No additional configuration is needed if the role is correctly annotated and the IAM permissions are properly set up.
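To confirm from inside the pod that the token is actually available, a check along the following lines can help. It is a minimal sketch that assumes the standard EKS setup, in which the pod receives the `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` environment variables and the token file holds a JWT. The function names are illustrative, and the token signature is not verified here.

```python
import base64
import json
import os
import time
from pathlib import Path


def decode_jwt_claims(token: str) -> dict:
    """Decode the (unverified) payload segment of a JWT into a dict."""
    payload = token.split(".")[1]
    # JWTs use unpadded base64url; restore the padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def check_web_identity(env=os.environ, now=None) -> list:
    """Return a list of problems found; an empty list means nothing obvious."""
    problems = []
    if not env.get("AWS_ROLE_ARN"):
        problems.append("AWS_ROLE_ARN is not set")
    token_file = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    if not token_file:
        problems.append("AWS_WEB_IDENTITY_TOKEN_FILE is not set")
        return problems
    path = Path(token_file)
    if not path.is_file():
        problems.append(f"token file not found: {path}")
        return problems
    claims = decode_jwt_claims(path.read_text().strip())
    current = time.time() if now is None else now
    exp = claims.get("exp")
    if exp is not None and exp <= current:
        problems.append("token has expired")
    return problems


if __name__ == "__main__":
    for problem in check_web_identity() or ["no obvious problems found"]:
        print(problem)
```

An empty result only shows that the inputs for assuming the role are present; it does not prove that the role's trust policy accepts the token.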

Validating the Endpoint Configuration:

The endpoint configuration in your awss3 exporter should be specific to the AWS S3 service in the region you are targeting. Since you're using the ap-south-1 region, the endpoint should typically be:

endpoint: "s3.ap-south-1.amazonaws.com"

Ensure that this endpoint matches the correct S3 region for the bucket you're using. If using a VPC endpoint or custom S3-compatible service, validate that the custom endpoint URL is correct.
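The endpoint check described above can be sketched as a small comparison between the configured endpoint and the regional S3 host name. This assumes the standard `aws` partition, where regional S3 endpoints follow the `s3.<region>.amazonaws.com` pattern; VPC endpoints and S3-compatible services will not match it. The helper names are hypothetical.

```python
from urllib.parse import urlparse


def expected_s3_host(region: str) -> str:
    """Regional S3 host name in the standard 'aws' partition (assumption)."""
    return f"s3.{region}.amazonaws.com"


def endpoint_matches_region(endpoint: str, region: str) -> bool:
    """Compare the host part of an endpoint with the regional S3 host name."""
    # Accept both bare hosts ("s3.x.amazonaws.com") and full URLs.
    parsed = urlparse(endpoint if "://" in endpoint else f"//{endpoint}")
    return (parsed.hostname or "").lower() == expected_s3_host(region)


if __name__ == "__main__":
    print(endpoint_matches_region("s3.ap-south-1.amazonaws.com", "ap-south-1"))
```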

Key things to check:

The region parameter in the awss3 exporter should also match the region where your bucket resides (ap-south-1).

The role used should have the required permissions for S3 actions (e.g., s3:PutObject, s3:GetObject, etc.).

Common Issues:

If the token isn't being properly fetched or the IAM role isn't correctly assumed, you may see errors like WebIdentityErr: failed to retrieve credentials. Ensure that the OpenTelemetry Collector pod has the correct serviceAccount annotations and that the role has policies attached for S3 access.

Let me know if you'd like more details or specific troubleshooting steps for any of these areas.