logstash-plugins / logstash-input-s3

Apache License 2.0

Logstash S3 input plugin assume role not working #213

Open christiangda opened 4 years ago

christiangda commented 4 years ago

Hi,

I'm trying to use the assume role functionality with the Logstash S3 input plugin, but I get the following error:

NOTE: It looks like the plugin is not assuming the role; I can't see any trace of a role being assumed in the logs.

[2020-07-20T07:18:46,508][ERROR][logstash.inputs.s3       ][main][790d495ae7a1e587d317915855ea5c21d64f412fed2b6c1bb7abb425f681f82f] 
Unable to list objects in bucket {:exception=>Aws::S3::Errors::AccessDenied, :message=>"Access Denied", 
:backtrace=>["/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/seahorse/client/plugins/raise_response_errors.rb:15:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/aws-sdk-core/plugins/s3_sse_cpk.rb:19:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/aws-sdk-core/plugins/s3_dualstack.rb:24:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/aws-sdk-core/plugins/s3_accelerate.rb:34:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/aws-sdk-core/plugins/jsonvalue_converter.rb:20:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/aws-sdk-core/plugins/idempotency_token.rb:18:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/aws-sdk-core/plugins/param_converter.rb:20:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/aws-sdk-core/plugins/response_paging.rb:26:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/seahorse/client/plugins/response_target.rb:21:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/seahorse/client/request.rb:70:in 
`send_request'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/seahorse/client/base.rb:207:in 
`block in define_operation_methods'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-resources-2.11.501/lib/aws-sdk-resources/request.rb:24:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-resources-2.11.501/lib/aws-sdk-resources/operations.rb:139:in 
`all_batches'", "org/jruby/RubyEnumerator.java:396:in 
`each'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-resources-2.11.501/lib/aws-sdk-resources/collection.rb:18:in 
`each'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-s3-3.5.0/lib/logstash/inputs/s3.rb:132:in 
`list_new_files'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-s3-3.5.0/lib/logstash/inputs/s3.rb:172:in 
`process_files'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-s3-3.5.0/lib/logstash/inputs/s3.rb:123:in 
`block in run'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/stud-0.0.23/lib/stud/interval.rb:20:in 
`interval'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-s3-3.5.0/lib/logstash/inputs/s3.rb:122:in 
`run'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:345:in `inputworker'", 
"/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:336:in `block in start_input'"], :prefix=>nil}

I have two AWS accounts: the first one only contains AWS IAM credentials and users, and the second one has the S3 buckets.

Account A

Here I have an IAM programmatic user inside a group with a policy to assume a role in account B:

access_key_id
secret_access_key

Policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AssumeRoleProd",
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": [
        "arn:aws:iam::<account B id removed>:role/<removed role name>"
      ]
    }
  ]
}

Account B

Here I have one bucket with log data, and a role to be assumed that has access to this bucket:

s3://mybucket

Role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::mybucket/*",
        "arn:aws:s3:::mybucket"
      ]
    }
  ]
}
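
(Note: the permissions policy above is only half of the role setup. For sts:AssumeRole to succeed, the role in account B must also have a trust policy that allows the account A principal; the account ID and user name below are placeholders for the values I removed. A typical trust policy looks like:)

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<account A id>:user/<user name>"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```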

As I mentioned before, it looks like the plugin is not assuming the role.

NOTE: If I create credentials directly in account B, the plugin works fine. That is, it works when I do not need to assume a role, with a conf like:

```
input {
  s3 {
    access_key_id => "account b credentials"
    secret_access_key => "account b credentials"
    #role_arn => "arn:aws:iam::<my account id removed>:role/<removed role name>"
    #role_session_name => "logstash_from_<removed information here>"
    bucket => "aleaplay.events.dynamo.i.player"
    #prefix => "2020/07/17/16" # not necessary; without it, all keys are read
    region => "eu-west-1"
    interval => 60
    gzip_pattern => "\.gz(ip)?$"
    additional_settings => {
      force_path_style => true
      follow_redirects => false
    }
  }
}
```
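
For clarity, the cross-account configuration I would expect to work is the same as above, but with the assume-role options enabled (the ARN, session name, and keys are placeholders):

```
input {
  s3 {
    access_key_id => "account A credentials"
    secret_access_key => "account A credentials"
    role_arn => "arn:aws:iam::<account B id>:role/<role name>"
    role_session_name => "logstash_session"
    bucket => "aleaplay.events.dynamo.i.player"
    region => "eu-west-1"
    interval => 60
    gzip_pattern => "\.gz(ip)?$"
  }
}
```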

Please let me know if I'm doing something wrong with this plugin, or if I have left some configuration off.

Environment information

```bash
# inside the container
bash-4.2$ logstash-plugin list --verbose --installed
...
logstash-input-s3 (3.5.0)
...
```

- Operating System:
  * CentOS 8 (podman container, docker.elastic.co/logstash/logstash-oss:7.8.0)
```bash
# inside the container
bash-4.2$ cat /etc/redhat-release
CentOS Linux release 7.8.2003 (Core)
```

log.level: debug

```
bash-4.2$ cat pipeline/logstash.conf
# Ansible managed
input {
  s3 {
    access_key_id => "removed"
    secret_access_key => "removed"
    role_arn => "arn:aws:iam:::role/"
    role_session_name => "logstashfrom"
    bucket => ""
    prefix => "2020/07/17/16" # not necessary; without it, all keys are read
    region => "eu-west-1"
    interval => 60
    gzip_pattern => "\.gz(ip)?$"
    additional_settings => {
      force_path_style => true
      follow_redirects => false
    }
  }
}

output {
  elasticsearch {
    ilm_enabled => false
    hosts => ["https://:9200"]
    index => "-%{+YYYY.MM.dd}"
    user => ""
    password => ""
    ssl => true
    ssl_certificate_verification => false
    cacert => "/usr/share/logstash/config/root-ca.pem"
  }
}
```

- Sample Data:
NA
- Steps to Reproduce:

```bash
podman run -d \
    --name=logstash-01 \
    --net=odfe \
    --hostname=logstash-01 \
    --privileged \
    --ulimit=host \
    --security-opt label=disable \
    --volume {{ logstash_host_volume_conf_path }}:/usr/share/logstash/config:ro \
    --volume {{ logstash_host_volume_pipeline_path }}:/usr/share/logstash/pipeline:ro \
    --volume {{ logstash_host_volume_data_path }}:/usr/share/logstash/data:rw \
    --volume {{ logstash_host_volume_logs_path }}:/usr/share/logstash/logs:rw \
    --cpus 1 \
    --memory 1g \
    --memory-reservation 512m \
    --memory-swap 1g \
  docker.elastic.co/logstash/logstash-oss:7.8.0 bash -c "bin/logstash-plugin install logstash-input-s3 && bin/logstash"
```
cabberley commented 4 years ago

It would appear from the source code that assume role only works if Logstash is running on an AWS EC2 instance and you're using the identity assigned to the instance, and only when you do not populate the access key and secret options, providing just an assume role ARN and session name.

The code would need changes to use a different identity for the assumed role, which would also make it work on a non-AWS-hosted server.

christiangda commented 4 years ago

Hi @cabberley , thanks for your comment.

When you are working on an EC2 instance, you don't need to assume the instance's role; that is handled automatically by the AWS SDK.

The common behaviour is to use assume-role when operating cross-account: you use your actual credentials to call the STS API and create temporary credentials for the second account.

We can see this in the docs: Creating an AWS STS Access Token

Of course, there could also be a case where you are operating on an EC2 instance and need to operate cross-account.

To me, if you pass access_key_id, secret_access_key and role_arn, it is because you intend to use the first two to call the AWS STS API, assume the role role_arn, and generate new credentials, as shown in Creating an AWS STS Access Token.

cabberley commented 4 years ago

Hi @christiangda I may not have explained very well. Your comments are correct.

What I am trying to say is that, the way the s3 plugin code has been written, if you supply access_key_id and secret_access_key in the .conf file, the code will never do the AssumeRole with the role_arn you provide. It will only use role_arn and execute AssumeRole if the .conf file has only role_arn.

The code that is the problem for us is actually part of logstash-mixin-aws, not this plugin.

The logic in the code says:

    IF access_key_id and secret_access_key are provided THEN use them to authenticate
    ELSIF credentials are in a YAML file THEN read access_key_id and secret_access_key from it and authenticate
    ELSIF role_arn is provided THEN do AssumeRole, using the EC2 instance identity as the identity authorized to assume role_arn
    END

Which means that if you provide all three values, as you want to, it will never do the assume role.

I made my own version of the plugin that changes the logic to cater for your scenario:

    IF access_key_id and secret_access_key are provided THEN use them to authenticate
    ELSIF credentials are in a YAML file THEN read access_key_id and secret_access_key from it and authenticate
    END
    IF role_arn is provided THEN do AssumeRole using the access_key_id and secret_access_key provided in the .conf file
    END
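
The two variants can be sketched in plain Ruby. This is a hypothetical simplification for discussion: the method names and return shapes are illustrative, not the actual logstash-mixin-aws API, although the option names mirror the plugin config.

```ruby
# Original logic: static keys, credentials file, and role_arn are mutually
# exclusive branches, so supplying all three silently ignores role_arn.
def original_credentials(opts)
  if opts[:access_key_id] && opts[:secret_access_key]
    { type: :static, keys: [opts[:access_key_id], opts[:secret_access_key]] }
  elsif opts[:aws_credentials_file]
    { type: :file, path: opts[:aws_credentials_file] }
  elsif opts[:role_arn]
    # Only reached when no keys are given: base identity is the EC2 instance.
    { type: :assume_role, base: { type: :instance_profile }, role: opts[:role_arn] }
  else
    { type: :default_chain }
  end
end

# Modified logic: first resolve base credentials, then, if role_arn is set,
# use those base credentials to call STS AssumeRole.
def modified_credentials(opts)
  base =
    if opts[:access_key_id] && opts[:secret_access_key]
      { type: :static, keys: [opts[:access_key_id], opts[:secret_access_key]] }
    elsif opts[:aws_credentials_file]
      { type: :file, path: opts[:aws_credentials_file] }
    else
      { type: :default_chain }
    end
  if opts[:role_arn]
    { type: :assume_role, base: base, role: opts[:role_arn] }
  else
    base
  end
end
```

With keys plus role_arn, the original version returns static credentials and never assumes the role, while the modified version assumes the role using those keys.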

Mine also caters for external_id, a parameter that AssumeRole sometimes requires depending on how the identity has been set up. Adding external_id does require a few other code changes to logstash-mixin-aws, but it doesn't require changes to the plugins that rely on logstash-mixin-aws. I use the s3 input plugin and the cloudwatch plugin, which rely on this code.

vjagannath786 commented 4 years ago

I am also facing the same issue. Can you please let me know how to solve this problem? I have installed Logstash on an on-prem server.

cabberley commented 4 years ago

If you don't need to use an external ID with your assume-role ARN, you can install the AWS CLI on the server and use the AWS credentials to set up the default AWS profile for the primary account. In your Logstash config, leave the access key and secret key out entirely; just put the role ARN and a session name in the config. The plugin will then present the default profile you set up with the CLI to AWS to get the assumed role in return. The code for this is actually in logstash-mixin-aws, not this plugin; this plugin has a dependency on it. If you need to use an external ID, I have an open pull request for an enhanced logstash-mixin-aws that extends the assume-role function to accept an external ID as well.
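
Concretely, the setup described above looks something like this (the key values, bucket, and ARN are placeholders):

```
# ~/.aws/credentials on the Logstash server (written by `aws configure`)
[default]
aws_access_key_id = <account A access key>
aws_secret_access_key = <account A secret key>
```

```
# Logstash pipeline: no access_key_id / secret_access_key, only the role
input {
  s3 {
    role_arn          => "arn:aws:iam::<account B id>:role/<role name>"
    role_session_name => "logstash"
    bucket            => "mybucket"
    region            => "eu-west-1"
  }
}
```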

vjagannath786 commented 4 years ago

Thanks a lot, brother... It worked!!!

vjagannath786 commented 3 years ago

@cabberley, can you help me with this error? Just wanted to know when this occurs: is it because of a configuration issue, or permissions on the S3 bucket? Also, I was able to list the objects through the AWS CLI.

sukhbir-singh commented 3 years ago

We are still facing this issue. Any updates on this?

niraj8241 commented 3 years ago

One of the ways I worked around this issue was to export the access keys as environment variables and then start Logstash.