apache / libcloud

Apache Libcloud is a Python library which hides differences between different cloud provider APIs and allows you to manage different cloud resources through a unified and easy to use API.
https://libcloud.apache.org
Apache License 2.0
Object metadata keys are lowercased when uploading to GCS #1530

Open sdcubber opened 3 years ago

sdcubber commented 3 years ago

Summary

Mixed-case metadata is lowercased when using libcloud to upload objects to Google Cloud Storage.

Detailed Information

I'm using libcloud to upload files to a Google Cloud Storage bucket together with object metadata. In the process, the keys in my metadata dict are being lowercased. This is not due to Google Cloud Storage, which does support mixed case metadata.

The issue can be reproduced following the example from the docs:

from libcloud.storage.types import Provider
from libcloud.storage.providers import get_driver

cls = get_driver(Provider.GOOGLE_STORAGE)
driver = cls('SA-EMAIL', './SA.json') # provide service account credentials here

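FILE_PATH = '/home/user/file'
# 'my-bucket' is a placeholder; use the name of an existing bucket
container = driver.get_container('my-bucket')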

extra = {'meta_data': {'camelCase': 'foo'}}

# Upload with metadata
with open(FILE_PATH, 'rb') as iterator:
    obj = driver.upload_object_via_stream(iterator=iterator,
                                          container=container,
                                          object_name='file',
                                          extra=extra)

The file uploads successfully, but in the resulting metadata, camelCase has been turned into camelcase.

I'm using apache-libcloud==3.0.0 with Python 3.8 on Linux (the python:3.8 Docker image).

Kami commented 3 years ago

Thanks for reporting this.

I assume this was done for some kind of cross-provider compatibility, but I'm not sure.

I think changing it should be fairly straightforward, but the change is backward incompatible, which means it would need to be documented in the upgrade notes, and we would also need a way to revert to the old behavior (or perhaps make it opt-in instead of opt-out).

Kami commented 3 years ago

I just had a look - I assume that's a limitation of the S3 API.

I verified that the S3 API lowercases all metadata keys (even when you create metadata items through the web interface).

The Google Storage driver is based on the S3 one and uses the S3-compatible API, so this is likely an S3 API limitation, and there is not much we can do besides implementing the native Google Storage API (which would likely take much more work).

Kami commented 3 years ago

I also verified we correctly pass headers as provided by the user directly to the API endpoint.

Per the S3 API docs, it is indeed a limitation of the S3 API - https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html:

... User-defined metadata is a set of key-value pairs. Amazon S3 stores user-defined metadata keys in lowercase. ...

And here are a couple more links for reference:

In short, since metadata in our implementation is sent as part of the HTTP request headers, and HTTP header names are case-insensitive, using only lowercase keys is safer in any case.
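To illustrate the point about headers, here is a minimal sketch of how a metadata dict could end up as lowercase header names. The function `metadata_to_headers` is hypothetical and is not libcloud's actual code; the `x-goog-meta-` prefix is assumed here as the custom metadata header prefix of the Google Storage XML API.

```python
# Hypothetical helper, NOT libcloud's implementation: it mimics what a
# backend that stores metadata keys in lowercase would end up with.
def metadata_to_headers(meta_data, prefix="x-goog-meta-"):
    """Map a metadata dict to lowercase HTTP header names."""
    headers = {}
    for key, value in meta_data.items():
        # HTTP header names are case-insensitive, so the casing of `key`
        # is not guaranteed to survive once it travels as a header name.
        headers[(prefix + key).lower()] = value
    return headers


headers = metadata_to_headers({"camelCase": "foo"})
print(headers)
```

Only the key is affected in this sketch; the value 'foo' is passed through unchanged.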

It appears to work via the console, since that likely uses the native Google Storage JSON API and not the S3-compatible XML one.

So maybe for now we just document this as a limitation in our Google Storage driver docs?

Kami commented 3 years ago

For now I added this note to the docs - be322400028ee9d27e1935dc1b15aa3f6c28cf16. Hope it helps.

Per that note - even if we ever support the Google Storage JSON API, it's probably still better not to rely on mixed casing if cross-provider compatibility is desired.
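For anyone who wants predictable keys across providers, one option is to lowercase the keys on the client side before uploading. The following is a sketch under that assumption; `normalize_metadata_keys` is a hypothetical helper and not part of libcloud.

```python
def normalize_metadata_keys(meta_data):
    """Return a copy of meta_data with all keys lowercased.

    Raises ValueError if two keys differ only by case, because they would
    collide once the backend stores them in lowercase.
    """
    normalized = {}
    for key, value in meta_data.items():
        lowered = key.lower()
        if lowered in normalized:
            raise ValueError(
                "metadata keys collide after lowercasing: %r" % lowered
            )
        normalized[lowered] = value
    return normalized


# Build the `extra` argument the same way as in the report above,
# but with keys that already match what the backend will store.
extra = {"meta_data": normalize_metadata_keys({"camelCase": "foo"})}
print(extra)
```

The resulting `extra` dict can be passed to `upload_object_via_stream` as in the original example; failing early on colliding keys avoids one value silently overwriting another.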

sdcubber commented 3 years ago

Alright, clear. Thanks for sorting this out!
