getlogbook / logbook

A cool logging replacement for Python.
http://logbook.readthedocs.org
Other
1.48k stars 165 forks source link

Fix utf-8 encoding issues for batched mails #221

Closed dioss-Machiel closed 8 years ago

dioss-Machiel commented 8 years ago

PR for #220

Edit: I haven't been developing in python long enough to understand the details of the encode and decode functions, if someone with more experience could take a look why this only works in python 3 that would be great.

coveralls commented 8 years ago

Coverage Status

Coverage decreased (-1.08%) to 74.183% when pulling 8cdc3f194403dda8aab2a4fb158e1cb83847d70e on Machiel-s:master into 790d87c9770fc5d1875ecc97ae89e94b667564de on getlogbook:master.

vmalloc commented 8 years ago

@Machiel-s thanks! however it seems the build fails on this PR

dioss-Machiel commented 8 years ago

Yes, it does fail but I am a bit stuck on why this is happening, Let me explain what I did: The bug that I described did not occur in the unit tests since the actual mail sending code where the encoding issue happens is bypassed. Since the issue is the .encode('ascii') I added that to the fake_mail_handler, however since the unit tests expect an array of bytes instead of a string I had to .decode('ascii') it again. For some reason this works fine on python 3 but not on python 2, Since I have only been developing python for a few days I feel a bit out of my league on what exactly is the issue here. I don't want to randomly start throwing in multiple encode/decodes so I would value the input of someone with a bit more expertise in this area.

vmalloc commented 8 years ago

Well, from a quick glance it seems the mail cannot be encoded as ascii, which makes sense since it's a very limited encoding. Perhaps you meant to convert to unicode and then to utf-8?

dioss-Machiel commented 8 years ago

I am trying to emulate this line: ''' 769 msg = _fix_eols(msg).encode('ascii') ''' from https://hg.python.org/cpython/file/3.4/Lib/smtplib.py

Which throws if UTF-8 is not passed to the set_payload call. What the UTF-8 parameter does is make the payload base64 encoded. Since base64 contains only ascii characters calling .encode('ascii') works fine.

However it seems that the python2 version of smtplib does not even support UTF-8, which makes the test fail since the payload is just passed as-is and is not base64 encoded. Then there is the issue of pypy3 also failing.

All in all it is starting to look like there should be different checks depending on the python version, but I feel this is not really maintainable. The only proper way to test all possible implementations to the fullest is to actually send a mail instead of stubbing out the mail code. However I feel like it does not add that more value for quite a lot of work.

I don't know how you want to proceed but I would suggest reverting the test code and just keeping the fix, since the fix doesn't break the unit tests anyway.

vmalloc commented 8 years ago

Ok. Can you force-push the change to remove the test changes? I'll merge it once it passes again.

coveralls commented 8 years ago

Coverage Status

Coverage remained the same at 75.261% when pulling c7551da3833919cb891645694912558d1970a4ab on Machiel-s:master into 790d87c9770fc5d1875ecc97ae89e94b667564de on getlogbook:master.

vmalloc commented 8 years ago

Thanks!