pycontribs / pyrax

The Python SDK for the Rackspace Cloud
developer.rackspace.com
Apache License 2.0

Problem with setting custom header 'Access-Control-Allow-Origin' for cloudfile in Rackspace #364

Closed alexmsu75 closed 10 years ago

alexmsu75 commented 10 years ago

Unable to configure 'Access-Control-Allow-Origin' header for file in RackspaceFiles

Steps to reproduce:

YYY - username
XXX - apikey
ZZZ - Rackspace service center (ORD/DFW/etc)
CCC - Cloud Files Container Name (test_cont)
FFF - Uploaded file name (t1.JPG)

[root@centos64x64v2x1 ~]# python -c 'import pyrax;pyrax.settings.set("identity_type", "rackspace");pyrax.settings.set("region","ZZZ");pyrax.set_credentials("YYY","XXX");cf=pyrax.cloudfiles;cont=cf.get_container("CCC");cf.object_meta_prefix="";obj=cont.get_object("FFF");print "t1.JPG metadata before:", cf.get_object_metadata(cont,obj);cf.set_object_metadata(cont,obj,{"Access-Control-Allow-Origin":"www.test.com"},clear=False);print "t1.JPG metadata after:", cf.get_object_metadata(cont,obj)' 

t1.JPG metadata before: {'content-length': '496987', 'accept-ranges': 'bytes', 'last-modified': 'Thu, 08 May 2014 13:51:09 GMT', 'etag': 'xxxxxxx', 'x-timestamp': '1399557068.10654', 'x-trans-id': 'tx9xxxxxxxxxxord1', 'date': 'Fri, 09 May 2014 00:56:43 GMT', 'access-control-allow-origin': 'www.othertest.com', 'content-type': 'image/jpeg'}

t1.JPG metadata after: {'content-length': '496987', 'accept-ranges': 'bytes', 'last-modified': 'Thu, 08 May 2014 13:51:09 GMT', 'etag': 'xxxxx', 'x-timestamp': '1399557068.10654', 'x-trans-id': 'txcxxxxxxxxxxxxord1', 'date': 'Fri, 09 May 2014 00:56:51 GMT', 'access-control-allow-origin': 'www.othertest.com', 'content-type': 'image/jpeg'}
[root@centos64x64v2x1 ~]#
alexmsu75 commented 10 years ago

I was working with Rackspace Cloud Files support and was advised to open a ticket here.

EdLeafe commented 10 years ago

It's generally not a good idea to change the value of cf.object_meta_prefix - I should probably make that a read-only property. Instead, pass prefix="" as a parameter to set_object_metadata() and get_object_metadata(). That tells pyrax not to use the standard object metadata prefix for that operation.

Also, you cannot set the CORS meta on individual objects; these settings only apply to entire containers. See http://docs.rackspace.com/files/api/v1/cf-devguide/content/CORS_Container_Header-d1e1300.html for more details.
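To make the role of the prefix concrete, here is a minimal sketch of how a metadata prefix changes the header name that is ultimately sent. `apply_prefix` is a hypothetical helper written for illustration, not pyrax's actual implementation, and `X-Object-Meta-` is assumed to be the standard object metadata prefix.

```python
OBJECT_META_PREFIX = "X-Object-Meta-"  # assumed standard object metadata prefix


def apply_prefix(metadata, prefix=OBJECT_META_PREFIX):
    """Return a copy of metadata with prefix prepended to unprefixed keys.

    Hypothetical helper for illustration only.
    """
    low_prefix = prefix.lower()
    result = {}
    for key, value in metadata.items():
        # Keys that already carry the prefix are left alone.
        if not key.lower().startswith(low_prefix):
            key = prefix + key
        result[key] = value
    return result


cors = {"Access-Control-Allow-Origin": "www.test.com"}
print(apply_prefix(cors))             # key gains the X-Object-Meta- prefix
print(apply_prefix(cors, prefix=""))  # key is sent unchanged
```

With the default prefix the header becomes a piece of custom object metadata rather than a real CORS header, which is why an empty prefix is needed for this use case.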

alexmsu75 commented 10 years ago

Thanks for the quick response. My HTML5 developers are requesting the 'Access-Control-Allow-Origin' header when CSS fonts are served from the CDN. Rackspace has ability to configure this header (per file, not per container) via GUI or command line using curl. I was looking for the same functionality in pyrax. I was able to find a 'getter' (get_object_metadata), which works fine, but the 'setter' does not work as I expected.

If you are saying that I should not be able to configure the 'Access-Control-Allow-Origin' header via pyrax, then I'm out of luck, since this feature is available from Rackspace.

Is there any other way in pyrax to produce the same result as the following curl commands?

[alexmsu@centos64x64v2x1 ~]$ curl -H "X-Auth-Token: $RSAUTHTKN" -X POST -H "X-Container-Meta-Access-Control-Allow-Origin: www.othertest.com" https://storage101.ord1.clouddrive.com/v1/MossoCloudFS_$RSCFORDURL/CCC

[alexmsu@centos64x64v2x1 ~]$ curl -v -H "X-Auth-Token: $RSAUTHTKN" -X POST -H "Access-Control-Allow-Origin: www.othertest.com" https://storage101.ord1.clouddrive.com/v1/MossoCloudFS_$RSCFORDURL/CCC/FFF

[alexmsu@centos64x64v2x1 ~]$ curl -I -H "X-Auth-Token: $RSAUTHTKN" https://storage101.ord1.clouddrive.com/v1/MossoCloudFS_$RSCFORDURL/CCC/
HTTP/1.1 204 No Content
Content-Length: 0
X-Container-Object-Count: 246
Accept-Ranges: bytes
X-Container-Meta-Access-Log-Delivery: false
X-Timestamp: 1398384373.26275
X-Container-Meta-Access-Control-Allow-Origin: www.othertest.com
X-Container-Bytes-Used: 4768875
Content-Type: text/plain; charset=utf-8
X-Trans-Id: txxxxxxxxxxx-xxxxxxxxord1
Date: Thu, 08 May 2014 19:10:53 GMT

[alexmsu@centos64x64v2x1 ~]$ curl -I -H "X-Auth-Token: $RSAUTHTKN" https://storage101.ord1.clouddrive.com/v1/MossoCloudFS_$RSCFORDURL/CCC/FFF
HTTP/1.1 200 OK
Content-Length: 496987
Accept-Ranges: bytes
Last-Modified: Thu, 08 May 2014 13:51:09 GMT
Etag: 4cxxxxxxxff
X-Timestamp: 1399557068.10654
Access-Control-Allow-Origin: www.othertest.com
Content-Type: image/jpeg
X-Trans-Id: txxxxxxxxxxx-xxxxxxxxord1
Date: Thu, 08 May 2014 19:10:44 GMT

[alexmsu@centos64x64v2x1 ~]$
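For comparison, the two curl POST requests above differ only in the target URL and the header name. The sketch below assembles the same method, URL, and headers as plain Python data without sending anything. `build_cors_post` and the placeholder token and account values are hypothetical, chosen to mirror the redacted values in the commands.

```python
STORAGE_BASE = "https://storage101.ord1.clouddrive.com/v1"


def build_cors_post(token, account, container, origin, obj_name=None):
    """Describe the POST that sets a CORS origin on a container or object.

    Hypothetical helper: it only assembles the request, it does not send it.
    """
    url = "%s/%s/%s" % (STORAGE_BASE, account, container)
    if obj_name is None:
        # Container metadata uses the X-Container-Meta- prefix.
        header = "X-Container-Meta-Access-Control-Allow-Origin"
    else:
        # On an object the header is sent without any metadata prefix.
        url = "%s/%s" % (url, obj_name)
        header = "Access-Control-Allow-Origin"
    return "POST", url, {"X-Auth-Token": token, header: origin}


print(build_cors_post("TOKEN", "MossoCloudFS_ACCOUNT", "CCC", "www.othertest.com"))
print(build_cors_post("TOKEN", "MossoCloudFS_ACCOUNT", "CCC", "www.othertest.com", "FFF"))
```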

EdLeafe commented 10 years ago

Rackspace has ability to configure this header (per file, not per container) via GUI or command line using curl.

The API docs are pretty clear that these headers only apply to containers, not individual objects. I do see that the GUI control panel only allows for setting this on objects, and not containers. I'll talk to the docs people about that.

You can absolutely do the same as the curl command; you just need to add prefix="" to the set_object_metadata() call to tell pyrax not to use the default prefix.

alexmsu75 commented 10 years ago

Thanks for the clarification. Can you please illustrate the approach with an example? I tried to use 'prefix' but had no success:

[root@centos64x64v2x1 ~]# python -c 'import pyrax;pyrax.settings.set("identity_type", "rackspace");pyrax.settings.set("region","ZZZ");pyrax.set_credentials("YYY","XXX");cf=pyrax.cloudfiles;cont=cf.get_container("CCC");cf.object_meta_prefix="";obj=cont.get_object("FFF");print "t1.JPG metadata before:", cf.get_object_metadata(cont,obj);cf.set_object_metadata(cont,obj,{"Access-Control-Allow-Origin":"www.test.com"},clear=False,prefix="");print "t1.JPG metadata after:", cf.get_object_metadata(cont,obj)'     
t1.JPG metadata before: {'content-length': '496987', 'accept-ranges': 'bytes', 'last-modified': 'Thu, 08 May 2014 13:51:09 GMT', 'etag': '4cxxxff', 'x-timestamp': '1399557068.10654', 'x-trans-id': 'txc3xxx-xxxord1', 'date': 'Sat, 10 May 2014 10:19:01 GMT', 'access-control-allow-origin': 'www.othertest.com', 'content-type': 'image/jpeg'}
t1.JPG metadata after: {'content-length': '496987', 'accept-ranges': 'bytes', 'last-modified': 'Thu, 08 May 2014 13:51:09 GMT', 'etag': '4cxxxff', 'x-timestamp': '1399557068.10654', 'x-trans-id': 'tx6xxx-xxxord1', 'date': 'Sat, 10 May 2014 10:19:01 GMT', 'access-control-allow-origin': 'www.othertest.com', 'content-type': 'image/jpeg'}
[root@centos64x64v2x1 ~]# 
EdLeafe commented 10 years ago

Try that code again, but remove the line:

cf.object_meta_prefix="" 

I've added an issue (#365) to make these values read-only.

alexmsu75 commented 10 years ago

cf.object_meta_prefix="" - was recommendation from Rackspace support. without cf.object_meta_prefix="" it does not work - get_object_metadata returns nothing, output just below:

t1.JPG metadata before: {}
t1.JPG metadata after: {}
EdLeafe commented 10 years ago

cf.object_meta_prefix="" - was recommendation from Rackspace support.

Well, then, that was a poor recommendation. ;-)

without cf.object_meta_prefix="" it does not work - get_object_metadata returns nothing, output just below:

Ah, I see the problem: there is no corresponding prefix parameter for get_object_metadata(). I've added an issue (#367), and have already added the code in my local copy, and verified that it is working as expected. It should be pushed to the working branch soon, and will appear in the next release.

michaelgeary commented 10 years ago

Ed, I believe I'm hitting the exact same issue. I am getting "X-Object-Meta-Access-Control-Allow-Origin:" headers instead of "Access-Control-Allow-Origin:". I believe this is preventing JavaScript from loading the asset (image) properly. I've checked out 1.8.1 of pyrax. I am trying to set this on the container and on the object, and am not getting anywhere. Are you able to revise https://github.com/rackspace/pyrax/blob/master/samples/cloudfiles/container_metadata.py (or the object version) to reflect what I should do?

EdLeafe commented 10 years ago

@michaelgeary Have you tried setting that value with prefix=""? Something like:

obj.set_metadata({"Access-Control-Allow-Origin": "http://example.com"}, prefix="")

Note that Cloud Files will allow you to set this on individual objects, but according to the docs: "Cloud Files supports CORS requests to containers and objects. CORS metadata is held on the container only. The values given apply to the container itself and all objects within it."

shredding commented 10 years ago

Have you tried setting that value with prefix=""

That's not working. It still applies the wrong headers.

ghost commented 10 years ago

I'm also having this issue. I've set "Access-Control-Allow-Origin" on all of my container objects, and it's resulting in the 'X-Container-Meta-Access-Control-Allow-Origin' header. @shredding, have you had any leads on this? Does setting a header at the container level result in anything different?

shredding commented 10 years ago

I escalated that to Rackspace support and they changed things for me. Their answer was:

In this case I have taken two steps, I've set the header 'Access-Control-Allow-Origin: ' on your content as well as 'X-Container-Meta-Access-Control-Allow-Origin: ' on the container. In addition to this, I issued a purge for all content in the container to ensure that all content would be served from the origin.

Testing against the page, the header is now present and there are no more CORS errors on the developer console, can you confirm?

In order to ensure that changes to content are promptly made live in future, I might recommend the use of a shorter TTL for the container, 15 minutes is the minimum but you are free to use any value.

Since then it's working. It's strange, because I tried the same before and it didn't work. Nonetheless, it was done via the web interface and not the API.

ghost commented 10 years ago

Thanks for the swift reply @shredding! It's a shame the API isn't setting the correct headers off the bat. (Maybe there's a reasoning beyond my comprehension that it behaves the way it does.)

sivel commented 10 years ago

I'm going to try working through this from beginning to end and try to lay out the proper steps. This should be complete within the next week.

ghost commented 10 years ago

@sivel Can you be more specific? Are you going to show the steps for properly setting a header on a container?

sivel commented 10 years ago

Front to back, I plan on showing the appropriate way to set the values via pyrax (on individual objects and the container), validate that the changes are applied properly, and test the end result.

ghost commented 10 years ago

Thanks @sivel, I'll be looking forward to this tutorial.

shredding commented 10 years ago

I am as well, because at the moment I have to manually do lots of clicky-clicky after each deployment.

shredding commented 10 years ago

Did you make any progress on that? We don't need a well-written tutorial, just a few bullets.

At the moment this is breaking my entire CI process, because I have to perform cumbersome actions by hand after each deploy.

sivel commented 10 years ago

I have made some progress, but I need to put something in order. There is a lot of confusion, and to be honest the process is not as simple as I would like.

As soon as I have time to get my thoughts in order, I will provide an update here. This may not be until Wednesday.

As a short point, I would recommend adding these headers via the headers kwarg of cont.upload_file or cont.create, instead of attempting to manipulate the headers after the objects are uploaded via set_metadata calls.

Example:

cont.create(data='cors', obj_name='cors.txt', headers={'Access-Control-Allow-Origin': 'http://pyrax.example.org'})

Attempting to manipulate the headers via set_metadata is going to take some additional functions that are not currently part of pyrax to achieve the required results.

Additionally, I will also note:

  1. Container Level CORS headers only apply to Storage URL actions (not CDN)
  2. Object Level CORS headers only apply to CDN URL (not Storage) and only work with GET requests
  3. Container Level CORS headers for CDN are planned

I'll provide more details soon.

sivel commented 10 years ago

I'll also note that simply supplying prefix='' via set_metadata does not work, as it also attempts to send headers that the server will not accept, and thus it ends up ignoring the whole request and not doing anything.

You have to specify prefix='', clear=True, which also means that if you don't want it to clear out other headers you may have set previously, you have to fetch all current headers, intelligently merge with new headers and perform a bit of header manipulation to ensure you are not sending headers that will cause the server to ignore your request.

sivel commented 10 years ago

As "promised" here is the outline I mentioned I would create for everyone. I hope this helps provide some better understanding.

There may be some additional work I need to do to make the process easier for cases where you don't want to be explicit in sending all headers or don't want to clear any other custom headers.

Where CORS headers work

  1. Container level CORS headers such as X-Container-Meta-Access-Control-Allow-Origin work for Storage URL operations. Specifically for FormPost and TempURL functionality. See CORS headers for containers.
  2. Object level CORS headers such as Access-Control-Allow-Origin work for CDN URL GET and HEAD operations. See CORS headers for objects.
  3. Container level CORS headers do not impact CDN functionality
  4. Object level CORS headers do not impact Storage functionality

How browsers interact with Object level CORS headers

When a browser is instructed to perform a CORS request for a "simple request", the browser will request the object, and then inspect the Access-Control-Allow-Origin header to determine if it is allowed to continue. See Simple Requests.

This is different from how a browser will handle a more complex request, in that it will send a pre-flight OPTIONS request to validate beforehand if it can continue. See Preflighted Requests.
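As a rough sketch of the check a browser applies to a simple request, the function below compares the page's origin with the Access-Control-Allow-Origin value returned with the object. It is a simplification (credentials and the other CORS response headers are ignored), and `origin_allowed` is a hypothetical name.

```python
def origin_allowed(page_origin, allow_origin_header):
    """Simplified simple-request check: wildcard or exact origin match."""
    if allow_origin_header is None:
        return False  # header absent: the browser blocks the response
    value = allow_origin_header.strip()
    return value == "*" or value == page_origin


print(origin_allowed("http://pyrax.example.org", "http://pyrax.example.org"))  # True
print(origin_allowed("http://pyrax.example.org", "*"))                         # True
print(origin_allowed("http://pyrax.example.org", None))                        # False
```

Because an origin includes the scheme, a header value without one, such as www.test.com, would not match a page served from http://www.test.com under this comparison.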

Storage URL CORS

This works just as described in the docs and is pretty easy to set on a container:

metadata = {
    'Access-Control-Allow-Origin': 'http://pyrax.example.org'
}
cont.set_metadata(metadata)

When performing actions against the Storage URL, you must also pass along X-Auth-Token with your requests.

CDN URL CORS

This is the source of most of the confusion and questions. Setting these headers can be complicated.

How metadata/headers are set or updated for objects

For objects, the POST request to set metadata deletes all metadata that is not explicitly set in the request. In other words, ALL the object metadata is set at the time of the POST request. If you want to edit or remove one header, include all other headers in the POST request and leave out the header that you want to remove. This means that if you delete one entry without posting the others, the others will also be deleted at that time.

See Create or update object metadata.
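The replace-everything behavior described above can be modeled with plain dictionaries. This is a simplified model for illustration only: `post_object_metadata` and the `X-Object-Meta-Color` key are hypothetical, and server-managed headers such as Content-Length are not modeled.

```python
def post_object_metadata(current, posted):
    """Model an object metadata POST: posted headers replace all old ones.

    `current` is deliberately ignored, which is the point of the model.
    """
    return dict(posted)


current = {
    "X-Object-Meta-Color": "blue",
    "Access-Control-Allow-Origin": "http://pyrax.example.org",
}

# Posting only one header drops the other.
after = post_object_metadata(current, {"X-Object-Meta-Color": "red"})
print(after)

# To keep the CORS header, resend it together with the change.
resend = dict(current)
resend["X-Object-Meta-Color"] = "red"
print(post_object_metadata(current, resend))
```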

Setting CORS headers during object creation

This is one of the easiest ways to set the correct headers for an object:

headers = {
    'Access-Control-Allow-Origin': 'http://pyrax.example.org'
}
cont.create(data='cors', obj_name='cors.txt', headers=headers)

or

headers = {
    'Access-Control-Allow-Origin': 'http://pyrax.example.org'
}
cont.upload_file('/path/to/cors.txt', obj_name='cors.txt', headers=headers)

Setting CORS headers with set_metadata, Option 1

This can be simple, but it involves resetting all headers on the object. Resetting all headers is required because, when clear=False and prefix='', pyrax will attempt to re-set headers that the API will not accept, which results in the API effectively ignoring the request. Those headers are:

Content-Length
X-Timestamp
Last-Modified
Etag
X-Trans-Id
Date
Accept-Ranges

Here is an example of how to do this:

metadata = {
    'Access-Control-Allow-Origin': 'http://pyrax.example.org'
}
obj.set_metadata(metadata, clear=True, prefix='')

In the above example, set_metadata clears out any other custom headers and explicitly sets only the headers specified in the metadata dictionary. This may be an acceptable solution for those who do not set additional custom headers beyond the CORS headers, or for those who maintain their own record of the headers an object should have, treat that record as the source of truth, and do not rely on the metadata and headers currently set on the object.
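If such a record of headers is ever seeded from a response fetched from the object, the server-managed headers listed above have to be filtered out first. A minimal sketch, assuming the fetched headers are available as a plain dict; `drop_protected` is a hypothetical helper and not part of pyrax.

```python
# Server-managed headers named earlier in this comment, normalized to
# lowercase with hyphens.
PROTECTED = frozenset([
    "content-length", "x-timestamp", "last-modified", "etag",
    "x-trans-id", "date", "accept-ranges",
])


def drop_protected(headers):
    """Return a copy of headers without the server-managed entries."""
    cleaned = {}
    for key, value in headers.items():
        # Normalize so Content-Length, content_length, etc. all match.
        if key.lower().replace("_", "-") not in PROTECTED:
            cleaned[key] = value
    return cleaned


fetched = {
    "Content-Length": "496987",
    "Etag": "4cxxxff",
    "Content-Type": "image/jpeg",
    "Access-Control-Allow-Origin": "www.othertest.com",
}
print(drop_protected(fetched))
```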

Setting CORS headers with set_metadata, Option 2

This is the most complicated form of setting CORS headers or non prefixed headers on an object.

Assuming that none of the above options are acceptable, and you want to update the headers and either set or replace Access-Control-Allow-Origin, it will require a bit more work.

In short, the steps required to make it work are:

  1. Get all prefixed metadata for the object (keys prefixed with X-Object-Meta-)
  2. "Massage" those keys, to prepend the prefix (X-Object-Meta-)
  3. Get all metadata/headers for the object
  4. Perform a case insensitive keyed merge of massaged prefixed keys, and all keys
  5. Perform a case insensitive keyed merge of all existing keys with new keys to be set (such as Access-Control-Allow-Origin)
  6. Wipe out the special/protected headers as outlined above
  7. Set the metadata on the object (This doesn't require clear=True since we are explicitly defining all headers/metadata)

This method requires a few functions which do not exist in pyrax.

Additionally, you will also see several calls to obj.manager.get_metadata instead of just obj.get_metadata. This is due to the prefix kwarg not being exposed at all levels of get_metadata functions. The prefix kwarg has been added to all of the get_metadata functions in the working branch.

Here is an example of adding the CORS headers, while keeping all previously set headers/metadata (some of this can be shortcut, if you maintain a list of all headers that an object should have, and rely on that as the source of truth as opposed to relying on the headers/metadata that is already set on an object.):

import pyrax

def lower_key(key):
    """Lowercase string, replace - with _"""
    return key.lower().replace("-", "_")

def header_insensitive_update(dct1, dct2):
    """Update header dict with case insensitive keys"""
    lowkeys = dict([(lower_key(key), key) for key in dct1])
    for key, val in dct2.items():
        d1_key = lowkeys.get(lower_key(key), key)
        dct1[d1_key] = val

def get_headers(obj, headers={}):
    """Helper to build headers dict with all keys to be set on an object

    excludes special keys, and properly handles OBJECT_META_PREFIX
    capitalization on already prefixed keys
    """

    # List of special keys that we cannot send
    no_touch_keys = [
        'content_length',
        'x_timestamp',
        'last_modified',
        'etag',
        'x_trans_id',
        'date',
        'accept_ranges'
    ]
    # We need to "massage" the prefixed keys, so they are formatted properly
    d = pyrax.object_storage._massage_metakeys(obj.get_metadata(),
            pyrax.object_storage.OBJECT_META_PREFIX)
    # Current master, doesn't support prefix on get_metadata directly
    # on the object, fetch *all* headers
    all_current = obj.manager.get_metadata(obj, prefix='')
    header_insensitive_update(d, all_current)
    header_insensitive_update(d, headers)
    # Wipe out special keys
    d.update(dict.fromkeys(no_touch_keys))
    return d

# Create a container
cont = pyrax.cloudfiles.create_container(pyrax.utils.random_ascii())
# Create an object
obj = cont.create(data='cors', obj_name='cors.txt')

# New headers to be set
headers = {
    'Access-Control-Allow-Origin': 'http://pyrax.example.org'
}
# Build full headers dict
d = get_headers(obj, headers)
# Set full headers, we can set clear=True here, but it's not needed since
# we are sending a full headers dict with special keys nulled out
obj.set_metadata(d, prefix='')

print obj.manager.get_metadata(obj, prefix='')
shredding commented 10 years ago

Thank you, I will try that today and report back!

shredding commented 10 years ago

Okay, did it now:


cf = pyrax.cloudfiles
metadata = {
    'Access-Control-Allow-Origin': '*'
}

cont = cf.get(rackspace_container)
cont.set_metadata(metadata)

targets = ['fonts/FontAwesome.otf', 'fonts/fontawesome-webfont.ttf', 'fonts/fontawesome-webfont.woff']
for target in targets:
    obj = cf.get_object(cont, target)
    obj.set_metadata(metadata, clear=True, prefix='')
    obj.purge()

When I now run curl -I https://a6666f8b18b8592a45c5-6554b50e223d01473ab813fd09ed26e3.ssl.cf1.rackcdn.com/fonts/fontawesome-webfont.woff\?v\=4.1.0

I get the correct Access-Control-Allow-Origin Header:

HTTP/1.1 200 OK
Content-Length: 83760
Accept-Ranges: bytes
Last-Modified: Thu, 16 Oct 2014 06:58:12 GMT
ETag: fdf491ce5ff5b2da02708cd0e9864719
X-Timestamp: 1413442691.62260
Access-Control-Allow-Origin: *   
Content-Type: application/octet-stream
X-Trans-Id: txa430029325c94f5480263-00543f6cd8dfw1
Cache-Control: public, max-age=259200
Expires: Sun, 19 Oct 2014 06:59:36 GMT
Date: Thu, 16 Oct 2014 06:59:36 GMT
Connection: keep-alive

However, Chrome still complains, but that may be a caching issue. I'll report back.
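The manual header check above could be scripted as part of a deploy. The sketch below parses `curl -I` style output and looks up the CORS header; `parse_header_block` is a hypothetical helper, and the sample text is a shortened copy of the response shown above.

```python
def parse_header_block(raw):
    """Parse `curl -I` style output into (status_line, headers dict)."""
    lines = [line.strip() for line in raw.strip().splitlines() if line.strip()]
    status, headers = lines[0], {}
    for line in lines[1:]:
        # Split on the first colon only; values such as dates contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers


sample = """HTTP/1.1 200 OK
Content-Length: 83760
Access-Control-Allow-Origin: *
Content-Type: application/octet-stream
Date: Thu, 16 Oct 2014 06:59:36 GMT
"""
status, headers = parse_header_block(sample)
print(status, headers.get("access-control-allow-origin"))
```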

shredding commented 10 years ago

Yop, it works. Thx!

sivel commented 10 years ago

With the fixes that have gone into working (to be released as 1.9.3 soonish) and the above description, I'm going to go ahead and close this issue.