boto / boto

For the latest version of boto, see https://github.com/boto/boto3 -- Python interface to Amazon Web Services
http://docs.pythonboto.org/

S3 server-side encryption using AWS KMS doesn't work, signature problem #2921

Open carlbolduc opened 9 years ago

carlbolduc commented 9 years ago

Hi,

I want to do what is described here in Java (second sample).

My code:

from boto.s3.connection import S3Connection
from boto.s3.key import Key
import os

os.environ['S3_USE_SIGV4'] = 'True'

AWS_ACCESS_KEY = "***"
AWS_SECRET_KEY = "***"
BUCKET_NAME = "testjavaencryption"
S3_FILE_PATH = "encrypted.txt"

f = open('labibi.txt')
result = ""
for line in f.readlines():
    result += line

def uploadToS3(fileName, fileContent):
    connection = S3Connection(AWS_ACCESS_KEY, AWS_SECRET_KEY, host='s3.amazonaws.com', debug=2)
    bucket = connection.get_bucket(BUCKET_NAME)
    key = Key(bucket)
    key.key = fileName
    key.set_contents_from_string(fileContent, headers={
        'x-amz-server-side-encryption': 'aws:kms',
        'x-amz-server-side-encryption-aws-kms-key-id': 'arn:aws:kms:us-east-1:***',
    })

uploadToS3(fileName=S3_FILE_PATH, fileContent=result)

The PUT request fails, I get the following error in my response:

SignatureDoesNotMatch.
The request signature we calculated does not match the signature you provided. Check your key and signing method.

If I remove the headers that contain my KMS key, the PUT request works. There appears to be a problem with how boto calculates the signature.

Since I have Java code that works, I compared the two PUT requests.

Python

PUT /encrypted.txt HTTP/1.1\r\n
Accept-Encoding: identity\r\n
x-amz-content-sha256: ***\r\n
Content-Length: 20\r\n
Host: testjavaencryption.s3.amazonaws.com\r\n
Content-MD5: ***\r\n
x-amz-server-side-encryption-aws-kms-key-id: arn:aws:kms:us-east-1:***\r\n
Expect: 100-Continue\r\n
X-Amz-Date: 20150128T195418Z\r\n
Authorization: AWS4-HMAC-SHA256 Credential=***/20150128/us-east-1/s3/aws4_request,SignedHeaders=content-length;content-md5;content-type;expect;host;user-agent;x-amz-content-sha256;x-amz-date;x-amz-server-side-encryption;x-amz-server-side-encryption-aws-kms-key-id,Signature=***\r\n
x-amz-server-side-encryption: aws:kms\r\n
Content-Type: application/octet-stream\r\n
User-Agent: Boto/2.34.0 Python/2.7.8 Windows/8\r\n\r\n'

Java

PUT /hello_s3_sse_kms.txt HTTP/1.1[\r][\n]
x-amz-content-sha256: STREAMING-AWS4-HMAC-SHA256-PAYLOAD[\r][\n]
Content-Length: 194[\r][\n]
Host: testjavaencryption.s3-external-1.amazonaws.com[\r][\n]
x-amz-server-side-encryption-aws-kms-key-id: arn:aws:kms:us-east-1:**[\r][\n]
Expect: 100-continue[\r][\n]
[\r][\n]
HTTP/1.1 100 Continue[\r][\n]
[\r][\n]
15; chunk-signature=***[\r][\n]
la bibi da bibi dum[\r][\n]
[\r][\n]
0;chunk-signature=***[\r][\n]
X-Amz-Date: 20150128T195535Z[\r][\n]
Authorization: AWS4-HMAC-SHA256 Credential=***/20150128/us-east-1/s3/aws4_request, SignedHeaders=content-length;content-type;host;user-agent;x-amz-content-sha256;x-amz-date;x-amz-decoded-content-length;x-amz-server-side-encryption;x-amz-server-side-encryption-aws-kms-key-id, Signature=***[\r][\n]
x-amz-server-side-encryption: aws:kms[\r][\n]
Content-Type: application/octet-stream[\r][\n]
User-Agent: aws-sdk-java/1.9.6 Windows_8.1/6.3 Java_HotSpot(TM)_64-Bit_Server_VM/25.25-b02/1.8.0_25[\r][\n]
x-amz-decoded-content-length: 21[\r][\n]
Connection: Keep-Alive[\r][\n]

As you can see, there are a few differences.

  1. x-amz-content-sha256 is different, Python's request contains a hash while the Java request seems to contain a type of hashing.
  2. Content-Length is much smaller with boto. I noticed that my Java code adds a metadata to the document with the length of the document, this is absent from the Python code. Still, the Python code works if I remove the kms headers.
  3. Host is different, although I doubt that it matters.
  4. Java is missing Content-MD5, but it works.
  5. With Java, Expect contains the file data. It might be the case with Python if I wasn't getting an error with the signature.
  6. The signature is different in the Authorization header (expected since Python signature is bad and the one from Java is good).

tpodowd commented 9 years ago

Hi Carl,

I chatted with you on boto users list about this before. I've recently been experimenting with SigV4 on Amazon and boto. I know a bit more about things now.

1. x-amz-content-sha256 is different, Python's request contains a hash while the Java request seems to contain a type of hashing.

This is no problem. Boto and Java are using two different methods to send the data. Boto is calculating the signature for the entire payload before sending and Java is using aws-chunking (which Boto doesn't support yet - although I am currently adding support).
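
As a rough illustration of the two values seen in the requests above (a sketch, not boto's actual code): for a single-shot upload the x-amz-content-sha256 header carries the hex SHA-256 digest of the whole body, while a chunked upload sends a fixed marker string and signs each chunk separately. The helper name `content_sha256` is made up for this example.

```python
import hashlib

# Marker the Java SDK sent instead of a digest (copied from the Java request).
STREAMING_PAYLOAD = 'STREAMING-AWS4-HMAC-SHA256-PAYLOAD'


def content_sha256(body):
    """Hex SHA-256 of the full request body (hypothetical helper)."""
    return hashlib.sha256(body).hexdigest()


# A single-shot PUT would send this digest in x-amz-content-sha256.
print(content_sha256(b'la bibi da bibi dum\n'))
```

Either form is acceptable to S3, which is why this difference alone does not explain the failure.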

2. Content-Length is much smaller with boto. I noticed that my Java code adds a metadata to the document with the length of the document, this is absent from the Python code. Still, the Python code works if I remove the kms headers.

Again, same thing. The content length difference is due to Java's aws-chunked mode. Both should be OK.
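
The larger Content-Length can be reproduced with a little arithmetic. The sketch below assumes the chunk framing from AWS's SigV4 streaming documentation (hex size, `;chunk-signature=`, a 64-character hex signature, CRLF, the data, CRLF, then a zero-length final chunk). The function names and the default chunk size are illustrative only.

```python
SIGNATURE_HEX_LENGTH = 64  # hex-encoded HMAC-SHA256 signature


def chunk_length(data_size):
    """Bytes on the wire for one aws-chunked chunk carrying data_size bytes."""
    header = '%x;chunk-signature=' % data_size
    return (len(header) + SIGNATURE_HEX_LENGTH + len('\r\n')
            + data_size + len('\r\n'))


def aws_chunked_content_length(decoded_length, chunk_size=64 * 1024):
    """Total Content-Length for a body of decoded_length bytes.

    chunk_size is an arbitrary value chosen for this sketch.
    """
    full_chunks, remainder = divmod(decoded_length, chunk_size)
    total = full_chunks * chunk_length(chunk_size)
    if remainder:
        total += chunk_length(remainder)
    return total + chunk_length(0)  # zero-length chunk terminates the body


# x-amz-decoded-content-length was 21 in the Java request.
print(aws_chunked_content_length(21))  # 194, the Content-Length Java sent
```

So the 194 bytes are the 21-byte file plus the per-chunk signature framing, not extra object metadata.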

3. Host is different, although I doubt that it matters.

You might be wrong. I think that is what the problem is.

4. Java is missing Content-MD5, but it works.

No problem. Content-MD5 is not required: with SigV4 the whole body is signed, so the signature effectively performs the same integrity check.

5. With Java, Expect contains the file data. It might be the case with Python if I wasn't getting an error with the signature.

yep.

6. The signature is different in the Authorization header (expected since Python signature is bad and the one from Java is good).

Different headers lead to different signatures so it's hard to know by just this. However, the signature is wrong so, we have to work with that.

Ok, so my suggestion is to modify the host header so it matches the Java one. Try this.

connection = S3Connection(AWS_ACCESS_KEY, AWS_SECRET_KEY, host='s3-external-1.amazonaws.com', ....)

Let me know if that works or not.

Tom.

carlbolduc commented 9 years ago

I modified the host header, here is the new PUT:

PUT /encrypted.txt HTTP/1.1
Accept-Encoding: identity
x-amz-content-sha256: ***
Content-Length: 20
Host: testjavaencryption.s3-external-1.amazonaws.com
Content-MD5: ***
x-amz-server-side-encryption-aws-kms-key-id: arn:aws:kms:us-east-1:***:key/***
Expect: 100-Continue
X-Amz-Date: 20150312T153006Z
Authorization: AWS4-HMAC-SHA256 Credential=***/20150312/external-1/s3/aws4_request,SignedHeaders=content-length;content-md5;content-type;expect;host;user-agent;x-amz-content-sha256;x-amz-date;x-amz-server-side-encryption;x-amz-server-side-encryption-aws-kms-key-id,Signature=***
x-amz-server-side-encryption: aws:kms
Content-Type: application/octet-stream
User-Agent: Boto/2.34.0 Python/2.7.8 Windows/8

I get the following error:

AuthorizationHeaderMalformed
The authorization header is malformed; the region 'external-1' is wrong; expecting 'us-east-1'

As you can see, both Host and Authorization headers get modified to use s3-external-1 with boto. With Java, only Host contains s3-external-1 while Authorization keeps us-east-1.
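
A simplified sketch of what seems to be happening, assuming boto derives the signing region from the first label of the host name. This is an illustration, not boto's real code, and both function names are hypothetical.

```python
def naive_region_from_host(host):
    """Guess the signing region from an S3 endpoint (hypothetical helper)."""
    first_label = host.split('.')[0]
    if first_label == 's3':
        return 'us-east-1'  # the classic global endpoint
    if first_label.startswith('s3-'):
        return first_label[len('s3-'):]
    return first_label


def credential_scope(date, region, service='s3'):
    """Scope string that appears in the Credential= part of Authorization."""
    return '/'.join([date, region, service, 'aws4_request'])


region = naive_region_from_host('s3-external-1.amazonaws.com')
print(credential_scope('20150312', region))
# 20150312/external-1/s3/aws4_request
```

The Java SDK evidently treats s3-external-1 as an alias for us-east-1 when building the scope, which is why only its Host header changes.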

tpodowd commented 9 years ago

Nice timing. I think I may have found the problem. I've a patch that I am testing. The signature calculation is off in Boto under certain conditions and this looks to be the same. I'll let you know when I create a pull request so you can possibly try it out.

crizCraig commented 9 years ago

Edit: Things seem to work with @tpodowd's header fix. I made one addition to forgo the ETag check for KMS-encrypted files here: https://github.com/tpodowd/boto/pull/2. This prevents boto from throwing an exception after a successful upload.
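
For context, a minimal sketch of the kind of check being skipped. After an upload, the ETag returned by S3 is compared with the quoted hex MD5 of the data; for objects encrypted with SSE-KMS the ETag is not the MD5 of the object, so the comparison fails even though the upload succeeded. The function name is invented for this example.

```python
import hashlib


def etag_matches_md5(etag, payload):
    """True if the ETag header equals the quoted hex MD5 of payload."""
    md5_hex = hashlib.md5(payload).hexdigest()
    return etag == '"%s"' % md5_hex


payload = b'la bibi da bibi dum\n'
plain_etag = '"%s"' % hashlib.md5(payload).hexdigest()

print(etag_matches_md5(plain_etag, payload))          # unencrypted upload
print(etag_matches_md5('"not-an-md5-digest"', payload))  # stand-in for a KMS ETag
```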

ssrikanta commented 9 years ago

@crizCraig, @tpodowd It looks like these changes are not available on the master branch yet. Please let us know when they will be merged into master. Also, please let us know if there is any workaround until the changes are available.

spg commented 9 years ago

I'd be interested in the temp workaround, if any.

crizCraig commented 9 years ago

I am just using the code in my pull request in a local module until the official fix is in.

Then:

import os
import sys

sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)) + '/boto')

import boto

chenziliang commented 8 years ago

Hi,

I tried the boto S3 API to download KMS-encrypted keys. At first it reported the error "Requests specifying Server Side Encryption with AWS KMS managed keys require AWS Signature Version 4". Then I set os.environ['S3_USE_SIGV4'] = 'True', but I got an "S3ResponseError: 403 Forbidden" error. Has anyone encountered the same issue before?

svrana commented 8 years ago

@chenziliang Just setting a Key Policy for a KMS key is not enough when accessing an S3 bucket encrypted with KMS, which is a mistake I made myself. Make sure you've also created a policy for your user that allows KMS access, i.e., go to Services -> IAM -> Users -> Create Group Policy -> Policy Generator -> AWS Key Management Service -> All actions (or at least decrypt for S3) and select resources (* for all).
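
A rough sketch of what the generated policy document might look like, built as a Python dict so it can be printed as JSON. The action list is an assumption (kms:Decrypt for downloads, kms:GenerateDataKey for uploads), and the `*` resource simply mirrors the suggestion above.

```python
import json

kms_policy = {
    'Version': '2012-10-17',
    'Statement': [{
        'Effect': 'Allow',
        # Assumed minimum for S3 with SSE-KMS: Decrypt to read objects,
        # GenerateDataKey to write them.
        'Action': ['kms:Decrypt', 'kms:GenerateDataKey'],
        # '*' mirrors the suggestion above; a specific key ARN is tighter.
        'Resource': '*',
    }],
}

policy_json = json.dumps(kms_policy, indent=2)
print(policy_json)
```

Restricting Resource to the ARN of the key actually used by the bucket is the safer choice once things work.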

kurttheviking commented 8 years ago

+1

Not sure what movement there is on this, but for people who find themselves here, a workaround for now:

import boto.exception

try:
    key.set_contents_from_string(fileContent, headers={
        'x-amz-server-side-encryption': 'aws:kms',
        'x-amz-ssekms-key-id': 'arn:aws:kms:...'
    })
except boto.exception.S3DataError:
    # Ignore the ETag/MD5 mismatch boto raises after a KMS-encrypted upload.
    pass

Using x-amz-ssekms-key-id doesn't seem to trigger the signature error. The try/except suppresses the ETag/MD5 error caused by using a KMS key. (python 2.7.6 / boto 2.38.0 -- ymmv)

Edit

x-amz-ssekms-key-id does not apply the target key, instead a default is used; thus, this technique does not solve the underlying issue

yeukhon commented 8 years ago

Can we merge this patch? It's been almost a year. Thoughts?

ztane commented 7 years ago

To monkeypatch this issue, you can use

from boto.auth import HmacAuthV4Handler
from boto.s3.key import Key
from boto.exception import PleaseRetryException

def canonical_headers(self, headers_to_sign):
    """
    Return the headers that need to be included in the StringToSign
    in their canonical form by converting all header keys to lower
    case, sorting them in alphabetical order and then joining
    them into a string, separated by newlines.
    """
    # first clean the headers
    clean = {}
    for header in headers_to_sign:
        c_name = header.lower().strip()
        raw_value = str(headers_to_sign[header])
        if '"' in raw_value:
            c_value = raw_value.strip()
        else:
            c_value = ' '.join(raw_value.strip().split())
        clean[c_name] = c_value

    # then append them sorted by name only
    canonical = []
    for header in sorted(clean):
        canonical.append('%s:%s' % (header, clean[header]))
    return '\n'.join(canonical)

HmacAuthV4Handler.canonical_headers = canonical_headers

def should_retry(self, response, chunked_transfer=False):
    provider = self.bucket.connection.provider

    if not chunked_transfer:
        if response.status in [500, 503]:
            # 500 & 503 can be plain retries.
            return True

        if response.getheader('location'):
            # If there's a redirect, plain retry.
            return True

    if 200 <= response.status <= 299:
        self.etag = response.getheader('etag')
        md5 = self.md5
        if isinstance(md5, bytes):
            md5 = md5.decode('utf-8')

        # If you use customer-provided encryption keys, the ETag value that
        # Amazon S3 returns in the response will not be the MD5 of the
        # object.
        server_side_encryption_customer_algorithm = response.getheader(
            'x-amz-server-side-encryption-customer-algorithm', None)
        # check for kms headers, their ETag also doesn't match
        if server_side_encryption_customer_algorithm is None:
            server_side_encryption_customer_algorithm = response.getheader(
                'x-amz-server-side-encryption-aws-kms-key-id', None)

        if server_side_encryption_customer_algorithm is None:
            if self.etag != '"%s"' % md5:
                raise provider.storage_data_error(
                    'ETag from S3 did not match computed MD5. '
                    '%s vs. %s' % (self.etag, self.md5))

        return True

    if response.status == 400:
        # The 400 must be trapped so the retry handler can check to
        # see if it was a timeout.
        # If ``RequestTimeout`` is present, we'll retry. Otherwise, bomb
        # out.
        body = response.read()
        err = provider.storage_response_error(
            response.status,
            response.reason,
            body
        )

        if err.error_code in ['RequestTimeout']:
            raise PleaseRetryException(
                "Saw %s, retrying" % err.error_code,
                response=response
            )

    return False

Key.should_retry = should_retry
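
The important line in the patched canonical_headers is the sort "by name only". Here is a small sketch of why that matters for these two headers, on the assumption that the unpatched code sorted the joined name:value lines rather than the names. The header values are placeholders.

```python
headers = {
    'x-amz-server-side-encryption': 'aws:kms',
    'x-amz-server-side-encryption-aws-kms-key-id': 'arn:aws:kms:us-east-1:***',
}

# Order used for the SignedHeaders list: sort the names.
by_name = sorted(headers)

# Order obtained by sorting the full "name:value" lines instead.
joined = sorted('%s:%s' % (name, value) for name, value in headers.items())
by_line = [line.split(':', 1)[0] for line in joined]

print(by_name)
print(by_line)
```

Because '-' sorts before ':', the full-line sort puts the key-id header first while the name sort puts it second. If the canonical headers and SignedHeaders disagree on order, the string that gets signed differs from what S3 reconstructs, which would produce SignatureDoesNotMatch only when both headers are present.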