elastic / elasticsearch-cloud-aws

AWS Cloud Plugin for Elasticsearch
https://github.com/elastic/elasticsearch/tree/master/plugins/discovery-ec2
577 stars 181 forks source link

base_path appears to be ignored on S3 bucket #230

Closed jsok closed 8 years ago

jsok commented 8 years ago

Hi there,

I'm running ES 1.6.6 and version 2.6 of this plugin.

My elasticsearch.yml repositories config looks like:

  repositories:
    s3:
      region: "ap-southeast-2"
      bucket: "my-bucket-name"
      base_path: "staging"

When I restrict my IAM policy to only my-bucket-name/staging/* (using the policies in the plugin README), I get the error:

curl -s -XPOST localhost:9200/_snapshot/my-bucket-name/_verify | jq \.
{
  "status": 500,
  "error": "RemoteTransportException[[ip-172-31-3-172-search][inet[/172.31.3.172:9300]][cluster:admin/repository/verify]]; nested: NotSerializableTransportException[[org.elasticsearch.repositories.RepositoryVerificationException] [my-bucket-name] path  is not accessible on master node; Unable to upload object tests-SySrhtH9RjyUA3ja1pFyWg-master; Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 6Axxxxxxx6BF3460); ]; "
}

My suspicion for base_path being ignored is the error message from https://github.com/elastic/elasticsearch/blob/master/core/src/main/java/org/elasticsearch/repositories/blobstore/BlobStoreRepository.java#L636 The error message calls basePath() however when it's printed above it's an empty string:

[my-bucket-name] path   is not accessible on master node

If base_path was being setup correctly I would expect the error message to say:

[my-bucket-name] path staging is not accessible on master node

If I loosen my IAM policy to the whole bucket, it happily verifies and creates snapshots but in the root of the bucket (ignoring my base_path)!

I can't find any docs or code to suggest my config is incorrect. Any help would be appreciated, thanks!

jsok commented 8 years ago

My IAM Policy for reference:

{
    "Statement": [
        {
            "Resource": [
                "arn:aws:s3:::my-bucket-name"
            ],
            "Action": [
                "s3:ListBucket",
                "s3:GetBucketLocation",
                "s3:ListBucketMultipartUploads",
                "s3:ListBucketVersions"
            ],
            "Effect": "Allow"
        },
        {
            "Resource": [
                "arn:aws:s3:::my-bucket-name/staging/*"
            ],
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject",
                "s3:AbortMultipartUpload",
                "s3:ListMultipartUploadParts"
            ],
            "Effect": "Allow"
        }
    ],
    "Version": "2012-10-17"
}

I can sanity check the policy by using AWS CLI to successfully upload files:

aws s3 cp test s3://my-bucket-name/staging/test
jsok commented 8 years ago

And listing the snapshots shows no base_path in the settings object:

# curl -s localhost:9200/_snapshot/_all | jq \.
{
  "my-bucket-name": {
    "settings": {
      "bucket": "my-bucket-name",
      "region": "ap-southeast-2"
    },
    "type": "s3"
  }
}

If I add a snapshot manually via PUT /_snapshot/my-bucket-name-manual and include base_path everything works as expected:

# curl -s localhost:9200/_snapshot/_all | jq \.
{
  "my-bucket-name-manual": {
    "settings": {
      "bucket": "my-bucket-name",
      "base_path": "staging",
      "region": "ap-southeast-2"
    },
    "type": "s3"
  },
  "my-bucket-name": {
    "settings": {
      "bucket": "my-bucket-name",
      "region": "ap-southeast-2"
    },
    "type": "s3"
  }
}

Which narrows down this issue -- how do you properly specify base_path in elasticsearch.yml?

dadoonet commented 8 years ago

It's not supported yet.

I guess we need to change this line

from:

        String basePath = repositorySettings.settings().get("base_path", null);

to

        String basePath = repositorySettings.settings().get("base_path", settings.get("repositories.s3.base_path", null));
jsok commented 8 years ago

Great news, I can open a PR for the fix if that solution is acceptable?

dadoonet commented 8 years ago

@jsok Added a PR here: https://github.com/elastic/elasticsearch/pull/12761

As there is a workaround for now (I mean setting the path in the repository itself), I think we can close your issue?

jsok commented 8 years ago

Sure, thanks for that! What's the release process for this plugin, I.e. How can I get the fix installed on my servers?

dadoonet commented 8 years ago

Well. I created the PR for elasticsearch 2.0 only so depending on when it will be merged you might be able to use it with elasticsearch 2.0 Beta1 or later.

You can't use this patch for prior versions and we will only release new plugin versions for former 2.0 elasticsearch versions if there are important issues.

But, you can fork the project, apply the same patch and build your own version if needed.

Is there any blocker for you to use the "workaround"?

jsok commented 8 years ago

Thanks for the explanation about releases. I wasn't familiar with the elastic search release model.

Can I suggest a clarification in the README that informs people that base_path (are there any others?) isn't settable via elasticsearch.yml? I'm sure I won't be the last to attempt this configuration.

dadoonet commented 8 years ago

@jsok Feel free to send a PR to this repository.

I changed the doc in the PR for next versions: https://github.com/elastic/elasticsearch/pull/12761/files#diff-59ee5813cfd6652346ece0f2a8da0a66R190