duointeractive / sea-cucumber

A Django email backend for Amazon Simple Email Service, backed by celery
http://pypi.python.org/pypi/seacucumber/
MIT License
134 stars 35 forks source link

Remove the "utf8 patch". #4

Closed originell closed 11 years ago

originell commented 12 years ago

The patch doesn't really solve the problem since boto does a base64 encode in send_raw_email which in turn does not work with unicode strings containing special characters (öäüá… – at least on py2x).

You can verify that by doing f.e.:

import base64
base64.b64encode(u'Fööbär')

By removing the decode (which is used to turn ascii to unicode), we actually support sending unicode strings. I know that sounds crazy but here is an example:

import base64
base64.b64encode(u'Fööbär'.encode('ascii'))

Works without an exception :-) When further debugging we can actually see that boto sets the correct UTF8 header for the mail, so that works flawlessly.

I ran into this while trying to send a german mail (so umlauts included) which in turn was generated (and translated) via django's render_to_string functionality (and i18n ofc).

Removing the patch solved my problem, however it does require that you pass in ASCII strings (so 'Fööbär'.encode('utf8')) or, as in my case, django.utils.safestring.SafeUnicode (as returned by django.template.loader.render_to_string).

That whole situation is pretty messed up since boto does not do an .encode('utf8') when at boto/ses/connection.py:L281 -.-

Another way to solve this issue would be to do an isinstance(var, unicode) check and do the encode part before delaying the SendEmailTask. This could be more favorable since it hides an implementation detail. On the other side it's much more implicit then,… Well I guess you will have to decide :-)

gtaylor commented 12 years ago

If you have time to submit a PR with the isinstance route taken, I'd be interested in giving it a shot.

originell commented 12 years ago

Well I'm just gonna update this one :)

originell commented 12 years ago

While trying the isinstance route (and quickly abandoning it since a try-except block would be better because objects could be unicode-like but not a direct instance of unicode) I found out that message.message().as_string() always returns an ascii-encoded string, so there is/should be no need to do an isinstance/try-except/decode.

I'm just wondering now why there was a need for the utf8 fix in the first place. I can't reproduce it, at least not with my german version of a registration mail :-S Would be great to have @asiermartinez opinion here I guess?

djw commented 12 years ago

There's a similar discussion going on over here: https://github.com/hmarr/django-ses/pull/41

In short, there was a bug introduced into boto 2.5.x (boto/boto@bb236ec) which meant that you had to supply a unicode string for raw_message. message.message().as_string() produces a string which is already encoded, hence the need to decode before passing to boto 2.5.x.

I fixed this a couple of months ago (boto/boto@5c1faab), but there hasn't been a release since then.

originell commented 12 years ago

Ohh okay. Thanks for the info! :)

gtaylor commented 12 years ago

Anyone running under boto develop able to comment on where we stand on this? I'm not terribly interested in maintaining hacks for old versions of boto, so if we're working after boto 2.5, I may mark this one as closed.