lericson / simples3

Simple, quick Amazon AWS S3 interface in Python
BSD 2-Clause "Simplified" License

Broken RFC 822 formatting on non-US locale #7

Closed. dahlia closed this issue 12 years ago.

dahlia commented 12 years ago

The current RFC 822 formatting relies solely on time.strftime(), with the format string defined in simples3.util.rfc822_fmt as '%a, %d %b %Y %H:%M:%S GMT'. According to the documentation of time.strftime():

%a: Locale’s abbreviated weekday name.
%b: Locale’s abbreviated month name.

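As a result, the current RFC 822 formatting breaks under non-US locales (e.g. ko_KR, ja_JP), because the weekday and month names are no longer the English abbreviations that RFC 822 requires. You can reproduce it easily: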

>>> import time
>>> t = time.gmtime()
>>> time.strftime('%a, %d %b %Y %H:%M:%S GMT', t)
'Wed, 30 Nov 2011 08:29:14 GMT'
>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'ko_KR')
'ko_KR'
>>> time.strftime('%a, %d %b %Y %H:%M:%S GMT', t)
'\xec\x88\x98, 30 11 2011 08:29:14 GMT'
>>> print _
수, 30 11 2011 08:29:14 GMT
>>> locale.setlocale(locale.LC_ALL, 'ja_JP')
'ja_JP'
>>> time.strftime('%a, %d %b %Y %H:%M:%S GMT', t)
'\xe6\xb0\xb4, 30 11 2011 08:29:14 GMT'
>>> print _
水, 30 11 2011 08:29:14 GMT

This bug also produces incorrect signatures, because the date string is part of the data that gets signed for AWS authentication.
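One way to sidestep the locale dependence is to avoid %a and %b altogether and look the English abbreviations up in fixed tables. The following is only a minimal sketch of that idea, not necessarily the fix simples3 adopted; the function name rfc822_gmt and the two tuples are hypothetical names chosen for illustration.

```python
import time
from email.utils import formatdate

# English abbreviations required by RFC 822, hard-coded so that the
# process locale cannot influence them. (Hypothetical names.)
_WEEKDAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def rfc822_gmt(t=None):
    """Format a UTC struct_time as an RFC 822 date, ignoring the locale."""
    if t is None:
        t = time.gmtime()
    # tm_wday counts from Monday = 0; tm_mon counts from January = 1.
    return "%s, %02d %s %04d %02d:%02d:%02d GMT" % (
        _WEEKDAYS[t.tm_wday], t.tm_mday, _MONTHS[t.tm_mon - 1],
        t.tm_year, t.tm_hour, t.tm_min, t.tm_sec)


if __name__ == "__main__":
    stamp = 1322641754  # the moment used in the session above
    print(rfc822_gmt(time.gmtime(stamp)))
    # The standard library offers a locale-independent formatter as well.
    print(formatdate(stamp, usegmt=True))
```

Both calls should print the same English-language date no matter which locale has been set with locale.setlocale(), since neither goes through strftime().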

yoloseem commented 12 years ago

On Ubuntu Linux, the behavior (especially of the %b directive) is slightly different:

>>> locale.setlocale(locale.LC_ALL, 'ko_KR.UTF8')
'ko_KR.UTF8'
>>> time.strftime('%a, %d %b %Y %H:%M:%S GMT', t)
'\xec\x88\x98, 30 11\xec\x9b\x94 2011 09:55:06 GMT'
>>> print _
수, 30 11월 2011 09:55:06 GMT

jbergstroem commented 12 years ago

Similar issues are common in Python land. On Gentoo, the general recommendation is to switch locales before testing, in the style of LC_CTYPE=en_US.utf8 python foobar.py.
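The same kind of locale switch can also be done inside the process. The sketch below temporarily sets the LC_TIME category, which is the one that governs %a and %b in strftime(), to the always-available "C" locale and restores the previous setting afterwards. The helper name c_time_locale is hypothetical, and since locale.setlocale() changes process-wide state, this approach is not safe when other threads format dates at the same time.

```python
import locale
import time
from contextlib import contextmanager

RFC822_FMT = '%a, %d %b %Y %H:%M:%S GMT'


@contextmanager
def c_time_locale():
    """Temporarily switch LC_TIME to the "C" locale (hypothetical helper)."""
    # Calling setlocale() with only a category returns the current setting.
    saved = locale.setlocale(locale.LC_TIME)
    try:
        locale.setlocale(locale.LC_TIME, "C")
        yield
    finally:
        # Restore whatever was active before, even if the body raised.
        locale.setlocale(locale.LC_TIME, saved)


if __name__ == "__main__":
    t = time.gmtime(1322641754)
    with c_time_locale():
        print(time.strftime(RFC822_FMT, t))
```

Compared with a lookup-table formatter, this keeps the existing format string but relies on global state, so it is better suited to tests and scripts than to library code.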