Closed bergwerf closed 1 year ago
Possibly related: https://github.com/localstack/localstack/issues/7514
Adding LS_LOG=trace gives the following details:
AWS s3.PutObject => 404 (NoSuchBucket);
PutObjectRequest({
'ACL': 'private',
'Bucket': 'tests-3n0i79ZTSrmFVPhx12R8LQ',
'CacheControl': None,
'ContentDisposition': None,
'ContentEncoding': None,
'ContentLanguage': None,
'ContentLength': 195,
'ContentMD5': None,
'ContentType': 'application/octet-stream',
'ChecksumAlgorithm': None,
'ChecksumCRC32': None,
'ChecksumCRC32C': None,
'ChecksumSHA1': None,
'ChecksumSHA256': None,
'Expires': None,
'GrantFullControl': None,
'GrantRead': None,
'GrantReadACP': None,
'GrantWriteACP': None,
'Key': 'master.dat',
'Metadata': {},
'ServerSideEncryption': None,
'StorageClass': 'STANDARD',
'WebsiteRedirectLocation': None,
'SSECustomerAlgorithm': None,
'SSECustomerKey': None,
'SSECustomerKeyMD5': None,
'SSEKMSKeyId': None,
'SSEKMSEncryptionContext': None,
'BucketKeyEnabled': None,
'RequestPayer': None,
'Tagging': None,
'ObjectLockMode': None,
'ObjectLockRetainUntilDate': None,
'ObjectLockLegalHoldStatus': None,
'ExpectedBucketOwner': None,
'Body': <_io.BufferedReader>
}, headers={
'Host': 'crate-bucket.localhost:4566',
'amz-sdk-invocation-id': '725b8f05-6e2d-01ff-5c72-4bb2da6a3e17',
'amz-sdk-request': 'attempt=1;max=4',
'amz-sdk-retry': '0/0/500',
'Authorization': 'AWS4-HMAC-SHA256 Credential=test/20230520/us-east-1/s3/aws4_request, SignedHeaders=amz-sdk-invocation-id;amz-sdk-request;amz-sdk-retry;content-length;content-type;host;user-agent;x-amz-acl;x-amz-content-sha256;x-amz-date;x-amz-decoded-content-length;x-amz-storage-class, Signature=25beeaeb561ee449babaf261b4b47c9a86ffe3cdf3bd72f74dc8c61a6f231a95',
'Content-Type': 'application/octet-stream',
'User-Agent': 'aws-sdk-java/1.12.353 Linux/5.10.0-21-amd64 OpenJDK_64-Bit_Server_VM/20.0.1+9 java/20.0.1 vendor/Eclipse_Adoptium cfg/retry-mode/legacy',
'x-amz-acl': 'private',
'x-amz-content-sha256': 'STREAMING-AWS4-HMAC-SHA256-PAYLOAD',
'X-Amz-Date': '20230520T155139Z',
'x-amz-decoded-content-length': '22',
'x-amz-storage-class': 'STANDARD',
'Content-Length': '195',
'Connection': 'Keep-Alive',
'Expect': '100-continue',
'x-localstack-tgt-api': 's3',
'x-moto-account-id': '000000000000'
});
NoSuchBucket(The specified bucket does not exist, headers={
'Content-Type': 'application/xml',
'Content-Length': '245',
'x-amz-request-id': '47294c3a-6d9c-4845-bc8c-4667825f36d9',
'x-amz-id-2': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234='
}
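It appears the CrateDB S3 plugin sends the wrong bucket name: the request is resolved to bucket tests-3n0i79ZTSrmFVPhx12R8LQ, while crate-bucket only shows up in the Host header.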
By setting rootLogger.level = debug in sandbox/crate/config/log4j2.properties I was able to confirm the following:
[2023-05-20T18:20:28,389][DEBUG][o.e.r.s.S3Repository ] [Ilmenspitze] using bucket [crate-bucket], chunk_size [1gb], server_side_encryption [false], buffer_size [100mb], cannedACL [], storageClass []
This log message originates from https://github.com/bergwerf/cratedb/blob/master/plugins/es-repository-s3/src/main/java/org/elasticsearch/repositories/s3/S3Repository.java#L136.
Using Wireshark and some documentation digging, I was able to determine that there is likely a mismatch between the Java AWS client library and LocalStack.
The Java API used by CrateDB to send a PutObject to S3 is: https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/PutObjectRequest.html#PutObjectRequest-java.lang.String-java.lang.String-java.io.InputStream-com.amazonaws.services.s3.model.ObjectMetadata-
Here it is specified that:
When using this API with an access point, you must direct requests to the access point hostname.
The access point hostname takes the form AccessPointName-AccountId.s3-accesspoint.Region.amazonaws.com.
I.e. the endpoint should be a virtual-hosted-style hostname: https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html
This is also supported by LocalStack, but (obviously) using a different top-level hostname: https://docs.localstack.cloud/user-guide/aws/s3/
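To make the difference between the two addressing styles concrete, here is a minimal sketch of how a server could derive the bucket from a request URL. It is not LocalStack's actual routing code: the function name resolve_bucket, the suffix list, and the example URLs (including the request path) are assumptions for illustration only. Under these assumptions, a host that is not a recognized S3 virtual-host name falls back to path-style, so the first path segment is taken as the bucket, which would match the Bucket and Key values in the trace above.

```python
from urllib.parse import urlsplit

# Hypothetical list of hostname suffixes that trigger virtual-hosted-style
# parsing. This is an assumption, not LocalStack's real configuration.
VIRTUAL_HOST_SUFFIXES = ("s3.localhost.localstack.cloud",)


def resolve_bucket(url, suffixes=VIRTUAL_HOST_SUFFIXES):
    """Return (bucket, key, style) for an S3 request URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    path = parts.path.lstrip("/")
    for suffix in suffixes:
        # Virtual-hosted style: the bucket is the part of the host
        # in front of a recognized S3 suffix.
        if host.endswith("." + suffix):
            bucket = host[: -len(suffix) - 1]
            return bucket, path, "virtual-hosted"
    # Path style (fallback): the first path segment is the bucket.
    bucket, _, key = path.partition("/")
    return bucket, key, "path"


# Assumed request URL; only the Host header is known from the trace.
print(resolve_bucket("http://crate-bucket.localhost:4566/tests-abc/master.dat"))
print(resolve_bucket(
    "http://crate-bucket.s3.localhost.localstack.cloud:4566/tests-abc/master.dat"
))
```

With the first URL the sketch picks tests-abc as the bucket (path style); with the second it picks crate-bucket (virtual-hosted style).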
In Java, the following error is produced:
IllegalArgumentException 'Endpoint does not contain a valid host name: http://crate-bucket.s3.us‑east‑2.localhost.localstack.cloud'
I suspect this originates from the AWS Java client library (com.amazonaws.services.s3.model.PutObjectRequest). Hence there is no straightforward way to fix this, and unfortunately it will not be possible to investigate the CrateDB demo extension via a local S3 mock server.