elastic / elasticsearch-cloud-aws

AWS Cloud Plugin for Elasticsearch
https://github.com/elastic/elasticsearch/tree/master/plugins/discovery-ec2

When not on EC2, creating repository can cause timeout during verification #206

Open robin13 opened 9 years ago

robin13 commented 9 years ago

When a repository is created from outside AWS (e.g. Google Compute Engine), the verification process can take longer than the default 30s timeout. As a result, the repository is not created on all nodes in the cluster, and snapshots cannot be created.

Would it be possible to add the timeout as a parameter to the S3 repository configuration?

dadoonet commented 9 years ago

I wonder if disabling verification would help in that case as a workaround.

@imotov Do you think we should add such an option (timeout?) to the core snapshot&restore feature?
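For reference, the workaround suggested above (skipping verification at registration time) could be sketched roughly as follows. This is only an illustration: the cluster address `http://localhost:9200` is an assumption, the repository name `my_s3_repository` and bucket `dr-elk` are borrowed from the error message later in this thread (treating `dr-elk` as the bucket name is itself an assumption), and the helper names are hypothetical. The sketch only builds the HTTP requests for the snapshot repository API (`verify=false` on registration, and the separate `_verify` endpoint to run verification manually later); it does not send them.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumption: a node reachable on the default HTTP port.
ES_URL = "http://localhost:9200"


def register_repository_request(name, bucket, verify=False, base_url=ES_URL):
    """Build (but do not send) a PUT request that registers an S3 repository.

    With verify=False the request asks Elasticsearch to skip the
    verification step that runs when a repository is registered.
    """
    query = urlencode({"verify": "true" if verify else "false"})
    body = json.dumps({"type": "s3", "settings": {"bucket": bucket}})
    return Request(
        f"{base_url}/_snapshot/{quote(name, safe='')}?{query}",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def verify_repository_request(name, base_url=ES_URL):
    """Build (but do not send) a POST request that triggers verification manually."""
    return Request(
        f"{base_url}/_snapshot/{quote(name, safe='')}/_verify",
        method="POST",
    )


if __name__ == "__main__":
    req = register_repository_request("my_s3_repository", "dr-elk")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing either request to `urllib.request.urlopen` would actually send it. Skipping verification only hides the symptom: a node that cannot reach the bucket will still fail later when a snapshot is taken.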

imotov commented 9 years ago

@robin13 Is there an error message logged when this happens? Could you post it here?

robin13 commented 9 years ago

See https://github.com/elastic/elasticsearch-cloud-aws/issues/149#issuecomment-93597887

robin13 commented 9 years ago

Sorry - I cannot find the original error messages, but it looks similar to #149. I have asked the customer to repeat the procedure and will post the error messages if/when I get them.

robin13 commented 9 years ago

I was able to simulate a bad connection using netem and got this error:

"error" : "RepositoryVerificationException[[my_s3_repository] [t_UCAimmREyhrEzfe7sIpA, 'RepositoryVerificationException[[my_s3_repository] store location [dr-elk] is not accessible on the node [[rclarke-node1][t_UCAimmREyhrEzfe7sIpA][es-rclarke][inet[/10.10.10.89:9300]]]]; nested: IOException[Unable to upload object tests-w1lIImwORQSdq3vJ_N8hfA-t_UCAimmREyhrEzfe7sIpA]; nested: AmazonS3Exception[Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed. (Service: Amazon S3; Status Code: 400; Error Code: RequestTimeout; Request ID: D96E0A3F50D94D12)]; ']]]",
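The comment above does not say which netem settings were used to reproduce the problem. As a rough sketch for anyone trying to repeat the test, the helpers below build `tc` command lines that add and remove a netem queueing discipline. The interface name `eth0`, the 2000 ms delay, and the 10% packet loss are all assumptions, not values from this report, and the helper names are hypothetical. The commands are only constructed here; running them requires root on a Linux host.

```python
import shlex


def netem_add_command(interface, delay_ms, loss_percent):
    """Return the argv for adding a netem qdisc with fixed delay and random loss."""
    return [
        "tc", "qdisc", "add", "dev", interface, "root", "netem",
        "delay", f"{delay_ms}ms",
        "loss", f"{loss_percent}%",
    ]


def netem_delete_command(interface):
    """Return the argv for removing the root netem qdisc again."""
    return ["tc", "qdisc", "del", "dev", interface, "root", "netem"]


if __name__ == "__main__":
    # Assumed values, chosen only to make S3 uploads slow and unreliable.
    print(shlex.join(netem_add_command("eth0", 2000, 10)))
    print(shlex.join(netem_delete_command("eth0")))
```

The resulting argument lists can be passed to `subprocess.run(..., check=True)` on the node under test. Remember to remove the qdisc afterwards, since it affects all traffic on that interface.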