Open robin13 opened 9 years ago
I wonder if disabling verification would help in that case as a workaround.
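For reference, a minimal sketch of what that workaround might look like, assuming the `verify=false` query parameter of the repository registration API is available in the version in use. The host `http://localhost:9200` is an assumption, the repository name is the one from the error message in this thread, and treating `dr-elk` as the bucket name is also an assumption. The helper `build_register_request` is hypothetical; it only builds the request and does not send it.

```python
import json
import urllib.request


def build_register_request(base_url, repo_name, bucket, verify=False):
    """Build (but do not send) a PUT request that registers an S3 repository.

    With verify=False the query string asks the cluster to skip the
    repository verification step at registration time.
    """
    query = "" if verify else "?verify=false"
    url = f"{base_url}/_snapshot/{repo_name}{query}"
    # Minimal repository definition; real settings (region, credentials,
    # base_path, ...) depend on the deployment and are omitted here.
    body = json.dumps({"type": "s3", "settings": {"bucket": bucket}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumed host and bucket, for illustration only.
req = build_register_request("http://localhost:9200", "my_s3_repository", "dr-elk")
# To actually register the repository: urllib.request.urlopen(req)
```

Skipping verification only avoids the check at registration time. A slow connection could still cause snapshot uploads to time out later, and verification can be run manually afterwards with `POST /_snapshot/my_s3_repository/_verify`.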
@imotov Do you think we should add such an option (timeout?) to the core snapshot and restore feature?
@robin13 Is there an error message logged when this happens? Could you post it here?
Sorry, I cannot find the original error messages, but it looks similar to #149. I have asked the customer to repeat the procedure and will post the error messages if and when I get them.
I was able to simulate a bad connection using netem and got this error:
"error" : "RepositoryVerificationException[[my_s3_repository] [t_UCAimmREyhrEzfe7sIpA, 'RepositoryVerificationException[[my_s3_repository] store location [dr-elk] is not accessible on the node [[rclarke-node1][t_UCAimmREyhrEzfe7sIpA][es-rclarke][inet[/10.10.10.89:9300]]]]; nested: IOException[Unable to upload object tests-w1lIImwORQSdq3vJ_N8hfA-t_UCAimmREyhrEzfe7sIpA]; nested: AmazonS3Exception[Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed. (Service: Amazon S3; Status Code: 400; Error Code: RequestTimeout; Request ID: D96E0A3F50D94D12)]; ']]]",
When a repository is created from outside AWS (e.g. Google Compute Engine), the verification process can take longer than the default 30s timeout. As a result, the repository is not created on all nodes in the cluster, and snapshots cannot be created.
Would it be possible to add the timeout as a parameter to the S3 repository configuration?