elastic / elasticsearch-cloud-aws

AWS Cloud Plugin for Elasticsearch
https://github.com/elastic/elasticsearch/tree/master/plugins/discovery-ec2

Add compatibility to CEPH #155

Closed armin-bauer closed 9 years ago

armin-bauer commented 9 years ago

I'm currently trying to get the Elasticsearch cloud-aws plugin to run with CEPH through its S3-compatible API.

Unfortunately, while taking a snapshot, I'm getting an error:

curl -XPUT 'http://es-node-1:9200/_snapshot/s3_backup' -d '{"type": "s3", "settings": { "bucket": "backup-test", "endpoint": "", "protocol":"https", "access_key": "", "secret_key": "" }}'

this operation is acknowledged.

curl -XPUT 'http://es-node-1:9200/_snapshot/s3_backup/snapshot?wait_for_completion=true'

this waits for a whole lot of time and then an exception is thrown...

{"error":"AmazonS3Exception[null (Service: Amazon S3; Status Code: 400; Error Code: InvalidArgument; Request ID: null)]","status":500}%

A lot of files have been created in the bucket, so the error message makes it seem like copying works and then something fails afterwards. Has anyone had a similar problem?

Any ideas?

The error message in the server log is:

I'm using elasticsearch 1.4.0 and the 2.4.2 snapshot version of the plugin

[2014-12-18 16:45:22,958][WARN ][snapshots ] [es-node-3] [s3_backup:snapshot] failed to finalize snapshot
com.amazonaws.services.s3.model.AmazonS3Exception: null (Service: Amazon S3; Status Code: 400; Error Code: InvalidArgument; Request ID: null), S3 Extended Request ID: null
    at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1077)
    at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:725)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:460)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:295)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3699)
    at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1135)
    at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1007)
    at org.elasticsearch.cloud.aws.blobstore.S3BlobContainer.openInput(S3BlobContainer.java:82)
    at org.elasticsearch.repositories.blobstore.BlobStoreRepository.readSnapshot(BlobStoreRepository.java:404)
    at org.elasticsearch.repositories.blobstore.BlobStoreRepository.finalizeSnapshot(BlobStoreRepository.java:326)
    at org.elasticsearch.snapshots.SnapshotsService$7.run(SnapshotsService.java:976)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

tlrx commented 9 years ago

@armin-bauer thanks for reporting. Just to be sure: the version 2.4.2 of the plugin is not released, are you using a snapshot release or the version 2.4.1?

This is the first time I have seen an error at this place, so I think further investigation will be needed. In the meantime, can you please tell me whether you have ever succeeded in snapshotting to CEPH before? Can you also try the snapshot again with a base_path configured in the repository (and verify that you have the correct access/privileges on the bucket)?

S3 compatible services sometimes misbehave with the AWS Java library.
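The base_path suggestion above could be tried with something like the following sketch. It reuses the repository name s3_backup and the settings from the original request. The function names, the environment variables S3_ACCESS_KEY and S3_SECRET_KEY, and the example prefix are assumptions made for illustration, not part of the plugin.

```shell
#!/bin/sh
# Sketch: build S3 repository settings that include a base_path, so snapshot
# files are written under a prefix instead of the bucket root.
# S3_ACCESS_KEY and S3_SECRET_KEY are hypothetical environment variables.
repo_settings_json() {
  endpoint="$1"; bucket="$2"; base_path="$3"
  printf '{"type": "s3", "settings": {"bucket": "%s", "base_path": "%s", "endpoint": "%s", "protocol": "https", "access_key": "%s", "secret_key": "%s"}}\n' \
    "$bucket" "$base_path" "$endpoint" "$S3_ACCESS_KEY" "$S3_SECRET_KEY"
}

# Register (or overwrite) the repository on the given node.
# Arguments: node, endpoint, bucket, base_path.
register_repo() {
  node="$1"; shift
  curl -XPUT "http://$node:9200/_snapshot/s3_backup" -d "$(repo_settings_json "$@")"
}
```

A call would look like `register_repo es-node-1 <your-ceph-endpoint> backup-test snapshots/es`, where the endpoint and the prefix snapshots/es are placeholders to replace with your own values.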

armin-bauer commented 9 years ago

@tlrx I never managed to back up to CEPH because of this error. It occurred somewhere in the middle of the backup, so the whole procedure was too flaky to rely on. Some files were stored in CEPH, but the backup was corrupt and could not be restored.

I used the current version from 3 months ago as well as older ones. Maybe it's worth retrying with the new AWS SDK that was put in a few days ago. If I get the chance to try, I'll post my findings.

andyHa commented 9 years ago

We tried it with 2.5.0 and got the same results. Privileges are OK as there were some files being created. I gave the base_path a go, but still got the same error. (We never succeeded to perform a snapshot into CEPH, but our own application uses it via the AWS SDK as well without any problems...)

How and where would I configure appropriate logging for the AWS plugin?

tlrx commented 9 years ago

@armin-bauer @andyHa Thanks for your valuable feedback.

I think we need to debug the plugin to see where the problem is. I tried to install CEPH to reproduce the bug, but I can't get it to work properly. I'll try again later if I have time. It would be great if you could enable logging and post the result here (take care to remove sensitive information).

If you want to enable logging for the AWS plugin and AWS SDK, edit the file logging.yml in the config directory of elasticsearch and add the following lines:

logger:
  # AWS SDK logs
  com.amazonaws: TRACE
  # AWS Plugin logs
  repositories.s3: TRACE
  cloud.aws: TRACE

herviou commented 9 years ago

Check which version of the AWS SDK you are using, and downgrade to 1.8.x instead of 1.9.y. I've had the same issue with CEPH, which does not implement the v4 signing method, and downgrading fixed it for me. See also https://github.com/ceph/s3-tests/issues/35
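One way to see which SDK version a node actually ships is to look at the jar names in the plugin directory. The sketch below assumes the plugin is installed under plugins/cloud-aws in the Elasticsearch home directory and that the SDK jars are named aws-java-sdk-*.jar with the version at the end. Both the layout and the function names are assumptions for illustration.

```shell
#!/bin/sh
# Print the version embedded in an AWS SDK jar file name,
# e.g. aws-java-sdk-1.8.11.jar or aws-java-sdk-s3-1.9.23.jar.
sdk_version_from_jar() {
  basename "$1" | sed -n 's/.*-\([0-9][0-9.]*\)\.jar$/\1/p'
}

# List the SDK versions bundled with the plugin.
# $1 is the Elasticsearch home directory (plugin path is an assumption).
list_sdk_versions() {
  for jar in "$1"/plugins/cloud-aws/aws-java-sdk-*.jar; do
    if [ -e "$jar" ]; then
      sdk_version_from_jar "$jar"
    fi
  done
}
```

Running `list_sdk_versions /path/to/elasticsearch` prints one version per SDK jar found, which makes it easy to tell a 1.8.x install from a 1.9.y one.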

andyHa commented 9 years ago

@tlrx Can you guys verify that? And is there any reason not to downgrade the dependency to 1.8.x?

herviou commented 9 years ago

Hi guys, I've got some good news for you. If you want to try this fix, it can be very easy: before downgrading to 1.8.x, try configuring the AWS S3 client like this:

import com.amazonaws.ClientConfiguration;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3Client;

ClientConfiguration configuration = new ClientConfiguration();
configuration.setProtocol(...);
....
// This forces the client not to use the v4 signer.
// See AmazonS3Client.createSigner and AmazonS3Client for the available signer types.
configuration.setSignerOverride("S3SignerType");
...
AmazonS3Client client = new AmazonS3Client(new BasicAWSCredentials(.,.), configuration);
client.setEndpoint...

dadoonet commented 9 years ago

That's great news ! I think we should do that using a new option like "signer" so CEPH users could use "S3SignerType".

dadoonet commented 9 years ago

Hey guys,

We implemented something in PR #202 and built a SNAPSHOT version for elasticsearch 1.5.x based on that. We would love to have your feedback and confirm that it fixes your issue.

To test it:

Stop your node.

Install new plugin version:

# Uninstall previous version
bin/plugin -remove cloud-aws
# Install version 2.5.1-SNAPSHOT
bin/plugin -install cloud-aws --url https://oss.sonatype.org/content/repositories/snapshots/org/elasticsearch/elasticsearch-cloud-aws/2.5.1-SNAPSHOT/elasticsearch-cloud-aws-2.5.1-20150421.082835-25.zip

Then edit your elasticsearch.yml file and set:

cloud.aws.signer: AWS3SignerType
# Or the following if you want to apply that only on S3 API
# cloud.aws.s3.signer: AWS3SignerType

Start your node and test again.

And report here! :)

Thanks for your help.

sbi-scireum commented 9 years ago

Setting the signer type to AWS3SignerType doesn't work for me with the newest plugin snapshot (using elasticsearch-cloud-aws-2.5.1-20150427.224854-32 from yesterday).

used setting in elasticsearch.yml: cloud.aws.signer: AWS3SignerType
ES version on client and nodes: 1.5.0
Java version on client and nodes: 1.8.0_40
aws-java-sdk version on client: 1.9.23

Getting this error:

Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: null (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: null), S3 Extended Request ID: null
    at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1078)
    at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:726)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:461)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:296)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3736)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3686)
    at com.amazonaws.services.s3.AmazonS3Client.listObjects(AmazonS3Client.java:640)
    at com.amazonaws.services.s3.AmazonS3Client.listObjects(AmazonS3Client.java:623)
    at org.elasticsearch.cloud.aws.blobstore.S3BlobContainer.listBlobsByPrefix(S3BlobContainer.java:116)
    at org.elasticsearch.repositories.blobstore.BlobStoreRepository.snapshots(BlobStoreRepository.java:378)
    at org.elasticsearch.snapshots.SnapshotsService.snapshots(SnapshotsService.java:147)
    at org.elasticsearch.action.admin.cluster.snapshots.get.TransportGetSnapshotsAction.masterOperation(TransportGetSnapshotsAction.java:87)
    at org.elasticsearch.action.admin.cluster.snapshots.get.TransportGetSnapshotsAction.masterOperation(TransportGetSnapshotsAction.java:43)
    at org.elasticsearch.action.support.master.TransportMasterNodeOperationAction$3.run(TransportMasterNodeOperationAction.java:134)
    ... 3 more

dadoonet commented 9 years ago

@herviou As reported by @sbi-scireum, it seems that the patch we did in #202 did not work although you said it works on your end. Before reverting the change ( #213 ), could you test on your end if the latest SNAPSHOT version we produced works for you. In that case, we can keep the change. If not, we revert.

Thanks for your help!

herviou commented 9 years ago

Can't test on ElasticSearch with Ceph now...

dadoonet commented 9 years ago

@herviou So you never tested it? Any chance you could test that on your own platform? What is the platform name BTW?

herviou commented 9 years ago

@dadoonet I've tested it on our CEPH platform with the aws-java-sdk and found the same issue as yours.

dadoonet commented 9 years ago

@herviou Ok so can you try the plugin (SNAPSHOT version) on your CEPH instance? So we can say if it works or not?

herviou commented 9 years ago

I've checked the code. Can @sbi-scireum try the configuration cloud.aws.signer: S3SignerType instead of cloud.aws.signer: AWS3SignerType?

In my own code I see the same behavior as @sbi-scireum with AWS3SignerType, while with S3SignerType it works. This is just a quick check.
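Switching the signer for a test run could be scripted roughly as follows. This is a sketch, not part of the plugin: set_signer is a hypothetical helper that drops any existing uncommented cloud.aws.signer line from the given elasticsearch.yml and appends the requested value. The node still has to be restarted afterwards.

```shell
#!/bin/sh
# Replace (or add) the cloud.aws.signer setting in an elasticsearch.yml file.
# $1 is the path to the config file, $2 the signer name, e.g. S3SignerType.
set_signer() {
  file="$1"; signer="$2"
  tmp="$(mktemp)"
  # Keep every line except an existing uncommented cloud.aws.signer entry.
  grep -v '^cloud\.aws\.signer:' "$file" > "$tmp" || true
  printf 'cloud.aws.signer: %s\n' "$signer" >> "$tmp"
  # Write back in place so the file keeps its ownership and permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

For example, `set_signer config/elasticsearch.yml S3SignerType` run from the Elasticsearch home directory (the relative path is an assumption about your layout) leaves exactly one cloud.aws.signer line in the file.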

dadoonet commented 9 years ago

@herviou Thanks!

When I looked at AWS SDK, I found that it should be AWS3SignerType but I might have misread this? https://github.com/aws/aws-sdk-java/blob/master/aws-java-sdk-core/src/main/java/com/amazonaws/auth/SignerFactory.java#L116

herviou commented 9 years ago

You're right, and your code seems consistent with the Javadoc. But I've tried S3SignerType instead of AWS3SignerType and it seems to work. Take a look at https://github.com/aws/aws-sdk-java/blob/master/aws-java-sdk-core/src/main/resources/awssdk_config_default.json

dadoonet commented 9 years ago

I see. It's region specific settings. Worth a try for sure :)