AWS signatures should support Unicode characters

harrigan commented 12 years ago

See #108.

sputnick commented 12 years ago

Hey there,

We have added a few workarounds to parameters before submitting them to awssum, to handle the UTF-8 encoding/escaping that happens with Meteor and probably most web apps. Would this patch make those adjustments unnecessary?

cheers Eiki, GreenQloud

Example code:

function encodeUTF8( s ){ return unescape( encodeURIComponent( s ) ); }

function decodeUTF8( s ){ return decodeURIComponent( escape( s ) ); }

function getBucketObjects(apiKey, secretKey, bucketName, prefix) {
    var bucketOptions = {
        BucketName : bucketName,
        Delimiter  : '/',
        Prefix     : encodeUTF8(prefix)
    };

    var result = getS3(apiKey, secretKey).ListObjects(bucketOptions);

    if (result.error) {
        inspect(result.error, "Error");
        return new Meteor.Error(500, result.error);
    }
    else {
        if (DEBUG)
            inspect(result.Body.ListBucketResult, 'Result');
        return result.Body.ListBucketResult;
    }
}

function createFolder(apiKey, secretKey, bucketName, prefix, folderName) {
    var result = getS3(apiKey, secretKey).PutObject({
        BucketName    : bucketName,
        ObjectName    : encodeUTF8(prefix + folderName + '/'),
        ContentLength : 0,
        ContentType   : 'application/x-directory',
        Body          : ''
    });

    if (result.error) {
        inspect(result.error, "Error");
        return new Meteor.Error(500, result.error);
    }
    else {
        if (DEBUG)
            inspect(result, 'Result');
        return result;
    }
}

harrigan commented 12 years ago

I don't think so. This patch only changes the helper functions for AWS Signature v4 -- which CloudSearch, DynamoDB, Glacier, IAM and Storage Gateway use.
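For context, here is a rough sketch of why Unicode matters to a Signature v4 helper. It is not the awssum code: the function names `hmacSha256`, `deriveSigningKey` and `signString`, the credentials and the string to sign are all made up for illustration. Only the key-derivation chain (date, region, service, `aws4_request`) follows the published Signature v4 scheme. The point is that the HMAC is computed over bytes, so the same non-ASCII string signs differently depending on how it is turned into bytes.

```javascript
var crypto = require('crypto');

// HMAC-SHA256 over `data`; string input is read as UTF-8 unless told otherwise.
// (Hypothetical helper, not part of awssum.)
function hmacSha256(key, data, inputEncoding) {
    return crypto.createHmac('sha256', key)
        .update(data, inputEncoding || 'utf8')
        .digest();
}

// Signature v4 signing-key derivation: date -> region -> service -> "aws4_request".
function deriveSigningKey(secretKey, dateStamp, region, service) {
    var kDate    = hmacSha256('AWS4' + secretKey, dateStamp);
    var kRegion  = hmacSha256(kDate, region);
    var kService = hmacSha256(kRegion, service);
    return hmacSha256(kService, 'aws4_request');
}

// Hex signature of an (already assembled) string to sign.
function signString(signingKey, stringToSign, inputEncoding) {
    return hmacSha256(signingKey, stringToSign, inputEncoding).toString('hex');
}

// Placeholder credentials and payload, for illustration only.
var key = deriveSigningKey('EXAMPLE-SECRET', '20120215', 'us-east-1', 'dynamodb');
var asUtf8   = signString(key, 'Item=café');
var asLatin1 = signString(key, 'Item=café', 'latin1');

console.log(asUtf8 === asLatin1); // false: the byte encoding changes the signature
```

If the client hashes different bytes than the service does, the request is rejected with a signature mismatch, which is why being explicit about UTF-8 matters here.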

The S3 code appears to use its own signature functions. But even if I patched those as well, there are some limitations. For example, I don't think it's possible to create an S3 bucket with Unicode characters, even from within the AWS console. I'm not sure about file and folder names.

chilts commented 12 years ago

Hey Guys,

Mike, thanks for the pull request. I've had a good sit down to figure out what was going on with Unicode. It all looks good now.

Added:

more unit tests using Unicode
integration test for DynamoDB
integration test for SQS
integration test for S3
examples for a few services

I've pulled your request as well, and then added a bunch of functionality to the esc() function. Unfortunately, none of escape(), encodeURI() or encodeURIComponent() does what we need, which is why I had to write something different. It now uses encodeURIComponent() to do the Unicode part and then converts a number of other chars which Amazon doesn't want.
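A minimal sketch of that approach, for anyone following along. This is not the actual esc() from awssum: `escSketch` is a made-up name, and the set of extra characters is an assumption based on the fact that encodeURIComponent() leaves `! ' ( ) *` untouched, while AWS signing expects everything outside `A-Z a-z 0-9 - _ . ~` to be percent-encoded.

```javascript
// Hypothetical stand-in for esc(): let encodeURIComponent() handle the
// Unicode (UTF-8) part, then percent-encode the characters it leaves alone.
function escSketch(str) {
    return encodeURIComponent(str).replace(/[!'()*]/g, function (c) {
        return '%' + c.charCodeAt(0).toString(16).toUpperCase();
    });
}

console.log(escSketch('café au lait')); // caf%C3%A9%20au%20lait
console.log(escSketch("!'()*"));        // %21%27%28%29%2A
console.log(escSketch('a-b_c.d~e'));    // a-b_c.d~e
```

The real function may well convert a different set of characters; the sketch only shows the two-step shape described above.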

Eirikur, I tried your encodeUTF8() but it breaks a fair number of other unit tests. I suspect it would be mostly okay but doesn't have full coverage (e.g. space ' ' should be encoded as '%20' which unescape(encodeURIComponent(' ')) doesn't do).
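To make the space example concrete, here is a quick check, assuming a runtime where the legacy unescape() global is available (it is in Node and in browsers). encodeUTF8() is copied from the comment above. It turns a string into its UTF-8 bytes, one character per byte, but does no percent-encoding at all:

```javascript
// Copied from the earlier comment: converts a string to a UTF-8 "byte string".
function encodeUTF8(s) { return unescape(encodeURIComponent(s)); }

// A space survives unchanged, so it never becomes '%20'.
console.log(encodeUTF8(' ') === ' ');   // true
console.log(encodeURIComponent(' '));   // %20

// Non-ASCII input becomes one character per UTF-8 byte.
var bytes = encodeUTF8('é');
console.log(bytes.length);                    // 2
console.log(bytes.charCodeAt(0).toString(16),
            bytes.charCodeAt(1).toString(16)); // c3 a9
```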

All in all, I think this is now fixed, but please check all your code. You shouldn't need to use encodeUTF8() anymore, but let me know if this is still a problem.

Cheers, Andy

sputnick commented 12 years ago

Ok thanks Andy, we will test this to death on our end ;)

chilts / awssum

AWS signatures should support Unicode characters #109