GoogleCloudPlatform / appengine-gcs-client

App Engine-Cloud Storage custom client library
Apache License 2.0

listbucket raises UnicodeEncodeError #35

Open michael-veksler opened 8 years ago

michael-veksler commented 8 years ago

Here is part of a trace:

  File "/base/data/home/apps/......py", line ...., in .....
    for file_stat in gcs.listbucket(path):
  File "XXXX/cloudstorage/cloudstorage_api.py", line 550, in __iter__
    self._path + '?' + urllib.urlencode(self._options))
  File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/urllib.py", line 1307, in urlencode
    v = quote_plus(str(v))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xbc' in position 45: ordinal not in range(128)

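At this point, self._options contains {'marker': u'ZZZZ/check/\xbc', 'prefix': u'ZZZZ/'}. Apparently, str(u'ZZZZ/check/\xbc') fails: in Python 2, calling str() on a unicode object implicitly encodes it with the ASCII codec, which cannot represent u'\xbc'.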
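The failure can probably be reproduced outside App Engine. The snippet below is only a sketch: it performs the ASCII encode explicitly (which is what Python 2's str() does implicitly) so that it behaves the same on Python 2 and Python 3, and it uses the shortened marker value from above rather than the real object name.

```python
try:
    from urllib import quote_plus  # Python 2
except ImportError:
    from urllib.parse import quote_plus  # Python 3

# Shortened marker value taken from the options shown above.
marker = u'ZZZZ/check/\xbc'

# Python 2's str(marker) is equivalent to marker.encode('ascii'),
# so encoding explicitly reproduces the error on either version.
try:
    marker.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding to UTF-8 bytes first gives quote_plus something it can quote.
quoted = quote_plus(marker.encode('utf8'))
```

With UTF-8 bytes, the non-ASCII character is percent-encoded as its two-byte sequence instead of raising.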

A fix I implemented seems to overcome this issue in our code base, but I don't want to invest too much time in understanding the implementation of appengine-gcs-client:

Index: cloudstorage/cloudstorage_api.py
===================================================================
--- cloudstorage/cloudstorage_api.py    (revision 9268)
+++ cloudstorage/cloudstorage_api.py    (revision 9269)
@@ -645,7 +649,11 @@
     if next_marker is None:
       self._options.pop('marker', None)
       return False
-    self._options['marker'] = next_marker
+    if isinstance(next_marker, unicode):
+      self._options['marker'] = next_marker.encode('utf8')
+    else: # Can this happen?
+      self._options['marker'] = str(next_marker) 
+
     return True

   def _find_elements(self, result, elements):
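The patch above only covers the marker option. A more general variant might normalize every option value just before the query string is built, since prefix is also unicode in the trace. This is a rough sketch, not the library's code: to_utf8 and encode_options are hypothetical helper names, and I am assuming the options are a flat dict of strings and numbers.

```python
try:
    from urllib import urlencode  # Python 2
except ImportError:
    from urllib.parse import urlencode  # Python 3

# unicode on Python 2, str on Python 3.
text_type = type(u'')


def to_utf8(value):
    """Hypothetical helper: return a value that urlencode can quote safely."""
    if isinstance(value, text_type):
        return value.encode('utf8')
    if isinstance(value, bytes):
        return value
    # Numbers and other non-string values are stringified.
    return str(value)


def encode_options(options):
    """Hypothetical helper: build a query string from an options dict."""
    # Sorting the keys keeps the output deterministic.
    pairs = [(key, to_utf8(value)) for key, value in sorted(options.items())]
    return urlencode(pairs)
```

For example, encode_options({'marker': u'ZZZZ/check/\xbc', 'prefix': u'ZZZZ/'}) should produce a plain ASCII query string with the marker percent-encoded as UTF-8. Whether the listbucket iterator should call something like this itself, or encode the values when they are stored, is a question for someone who knows the library better.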