Closed d-maurer closed 3 years ago
This is broken:
from plone import api
from Products.MailHost.MailHost import MailHost
from transaction import commit
from zope.component.hooks import setSite
setSite(app["Plone"])
portal = api.portal.get()
mh = portal["MailHost"]
body = """
Willkommen John Doe, Ihr Benutzerzugang wurde erstellt. Ihr Benutzername lautet: john.doe1@local.test. Bitte aktivieren Sie diesen, indem Sie zu http://127.0.0.1:8080/Plone/passwordreset/495e356abda246ebe6da184b4f4c311?userid=john.doe1@local.test gehen. Aktivieren Sie Ihren Zugang bitte vor dem 09.12.2020 12:40.
Mit freundlichen Grüßen
--
Plone """
bbody = body.encode("utf-8")
mh.send(
bbody,
"...",
"...",
subject="Test",
charset="utf-8",
immediate=True,
)
commit()
This is ok:
from plone import api
from Products.MailHost.MailHost import MailHost
from transaction import commit
from zope.component.hooks import setSite
setSite(app["Plone"])
portal = api.portal.get()
mh = portal["MailHost"]
body = """
Willkommen John Doe, Ihr Benutzerzugang wurde erstellt.
Ihr Benutzername lautet: john.doe1@local.test.
Bitte aktivieren Sie diesen, indem Sie
zu http://127.0.0.1:8080/Plone/passwordreset/495e356abda246ebe6da184b4f4c311?userid=john.doe1@local.test gehen.
Aktivieren Sie Ihren Zugang bitte vor dem 09.12.2020 12:40.
Mit freundlichen Grüßen
--
Plone """
bbody = body.encode("utf-8")
mh.send(
bbody,
"...",
"...",
subject="Test",
charset="utf-8",
immediate=True,
)
commit()
The length of the Line is the Problem.
1letter wrote at 2020-12-3 00:55 -0800:
This is broken: ... In
email.encoders
you find a definition for function_qencode
. Change this fromdef _qencode(s): enc = _encodestring(s, quotetabs=True) # Must encode spaces, which quopri.encodestring() doesn't do return enc.replace(b' ', b'=20')
to
def _qencode(s): enc = _encodestring(s, quotetabs=True) # Must encode spaces, which quopri.encodestring() doesn't do return enc.replace(b' ', b'=20').replace("\n", "\r\n")
This should make the quoted printable encoding almost standard conform (the line length might still be too high by 1). Check whether this change fixes your problem.
I have reported the Python problem: "https://bugs.python.org/issue42549".
I have patched the Package here with an alternative policy and here
# MailHost.py#L388
from email import policy
mo = message_from_string(_string_transform(messageText, charset), policy=policy.SMTP)
# MailHost.py#L405
mo['Subject'] = str(Header(subject, charset, errors='replace'))
The correct policy is missing. A look in the Python docs clarified the situation. I think we should use email.policy.SMTP and not the Default Policy.
I have test a little bit the mail creation without any patch Products.MailHost. The only one solution that i found is the follwing code snippet:
LONG_MULTI_LINE = """\
From: {}
To: {}
Content-Type: text/plain
Precedence: bulk
Dear Passengers,
Curabitur aliquet quam id dui posuere blandit. Vivamus magna justo, lacinia eget consectetur sed, convallis at tellus. Vivamus magna justo, lacinia eget consectetur sed, convallis at tellus.
Lorem ipsum dolor sit amet, consectetur adipiscing elit. http://127.0.0.1:8080/test/test/portal_setup/blub Curabitur non nulla sit amet nisl tempus convallis quis ac lectus. Mauris blandit aliquet elit, eget tincidunt nibh pulvinar a.
A little Umlaut: "üüß"
Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Donec velit neque, auctor sit amet aliquam vel, ullamcorper sit amet ligula. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec rutrum congue leo eget malesuada.
thanks, your plone
"""
from_address = api.portal.get_registry_record('plone.email_from_address')
from_name = api.portal.get_registry_record('plone.email_from_name')
sender = formataddr((from_name, from_address))
if parseaddr(sender)[1] != from_address:
sender = from_address
# EmailMessage Object with the SMTP Policy doesn't destroy the Multiline
msg = message_from_string(LONG_MULTI_LINE.format(sender, to), policy=policy.SMTP)
msg['Subject'] = Header(subject, "utf-8").encode()
# Add Attachments from a namedblobfile
for mailattachment in attachments:
msg.add_attachment(mailattachment.data, maintype='application', subtype='octet-stream', filename=attachment.filename)
# Send via MailHost
host = api.portal.get_tool('MailHost')
host.send(
msg,
"john.doe.recipient@local.test",
"john.doe.sender@local.test",
immediate=False,
charset='utf-8',
msg_type='text/plain'
)
1letter wrote at 2020-12-7 05:29 -0800:
I have test a little bit the mail creation without any patch Products.MailHost. The only one solution that i found is the follwing code snippet:
I have read the email.policy
documentation this morning.
It tells that the standard policy is not RFC (more precisely MIME)
compatible. As Products.MailHost
targets mail delivery,
the messages it creates should by MIME compatible.
Thus, Products.MailHost
should change as you have proposed.
Tests are a likely problem. Those tests mockup a later delivery stage
and test against the internal messages passed on to that stage.
As a consequence, most tests would need adaptations.
This applies to the tests of Products.MailHost
(easy) but
also to tests elsewhere -- e.g. Plone/application tests.
Yes, i agree. But who tell this the community? ;-)
@mauritsvanrees This issue demonstrates that Products.MailHost
does not produce standard conform messages which can cause problems in some environments (more precisely with SMTP agents which treat non standard messages wrongly). The fix is likely simple but could break email related tests -- in a way similar you have observed in "https://github.com/plone/buildout.coredev/pull/670#issuecomment-673517109". Do you have suggestion how we should handle this issue?
I have made a test via a local postfix mailer - all is fine. My other Tests with the encoding problem going all out via smtp and ms exchange.
@d-maurer If you make a PR for Products.MailHost, I can start Plone Jenkins jobs to test it, or run some locally to get a general idea.
Or you may be able to start them yourself:
Maurits van Rees wrote at 2020-12-11 02:37 -0800:
@d-maurer If you make a PR for Products.MailHost, I can start Plone Jenkins jobs to test it, or run some locally to get a general idea.
Thank you for your response.
I will let @1letter make the PR (unlike me, he can verify that everything works in a "hostile" environment). I will help as necessary.
@d-maurer Ok, i have patched some lines in Products.MailHost and Products.CMFPlone locally. But some Tests failing. My problem, i don't know enough the internals of headers mailconstruction. for example
# that is in a test of Products.CMFPlone
self.assertEqual(msg['Subject'], '=?utf-8?q?Password_reset_request?=')
with the policy.smtp in _munge_headers
if subject:
# remove any existing header otherwise we get two
del mo['Subject']
# Perhaps we should ignore errors here and pass 8bit strings
# on encoding errors
header_subject = Header('None ASCII Char Töst', charset, errors='replace')
mo['Subject'] = header_subject.encode()
import pdb;
pdb.set_trace()
-> if mto:
(Pdb) pp mo['Subject']
'None ASCII Char Töst'
(Pdb) pp header_subject.encode()
'=?utf-8?q?None_ASCII_Char_T=C3=B6st?='
(Pdb) pp type(mo['Subject'])
<class 'email.headerregistry._UniqueUnstructuredHeader'>
self.assertEqual(msg['Subject'], 'None ASCII Char Töst')
it would help refactor the tests, but i don't know is this the right way. In other words: what ist the correct construction of the subject line?
Subject: '=?utf-8?q?None_ASCII_Char_T=C3=B6st?='
or Subject: None ASCII Char Töst
have you a hint or suggestion?
1letter wrote at 2021-1-10 12:46 -0800:
@d-maurer
... it would help refactor the tests, but i don't know is this the right way. In other words: what ist the correct construction of the subject line?
What you have have changed should only affect line endings -- nothing inside a header itself (unless the header contained line endings or was split due to overlength).
Formerly, line endings could be "\n"; with your change line endings should be "\r\n". Nothing else should change due to your modification.
-- Dieter
Dieter Maurer wrote at 2021-1-10 23:25 +0100:
1letter wrote at 2021-1-10 12:46 -0800:
@d-maurer
... it would help refactor the tests, but i don't know is this the right way. In other words: what ist the correct construction of the subject line?
What you have have changed should only affect line endings -- nothing inside a header itself (unless the header contained line endings or was split due to overlength).
Formerly, line endings could be "\n"; with your change line endings should be "\r\n". Nothing else should change due to your modification.
I may understand why more changes:
Before your change, MailHost
has used a policy derived from
Compat32
(this stands for "compatibility with Python 3.2").
From previous discussions I suppose that you now derive from
the SMTP policy which is not based on Compat32
.
The documentation tells that Compat32
exists for compatibility
reasons but has known bugs (likely related to standard conformance).
Thus, the implicit move from Compat32
may introduce changed behavior.
I have not understood from your previous description what goes wrong with the test. Do you get an exception or a failure? What failure exactly?
Apart from the strange None
at the start of the reported header
value, the value treatment looks correct.
The "None" is only a word, it can be everything, like "car", "house, "color" - it's dummy text for test.
I only changed the call of helper function message_from_string(...., policy=policy.SMTP)
in some cases in Products.MailHost and Products.CMFPlone. The effect of this change:
Before change, the subject header contain this: '=?utf-8?q?my subject?='
After change, the subject header contain this:my subject
, without the utf-8 stuff
My Question: Which of both is the correct Headerform? If is the second, then i must rewrite the Tests in Products.CMFPlone. There are Test with checks like this self.assertEqual(msg['Subject'], '=?utf-8?q?my subject?=')
. Do you know what i mean?
1letter wrote at 2021-1-10 22:41 -0800:
The "None" is only a word, it can be everything, like "car", "house, "color" - it's dummy text for test. I only changed the call of helper function
message_from_string(...., policy=policy.SMTP)
in some cases in Products.MailHost and Products.CMFPlone. The effect of this change:Before change, the subject header contain this:
'my subject'
After change, the subject header contain this:my subject
, without the utf-8 stuffMy Question: Which of both is the correct Headerform? If is the second, then i must rewrite the Tests in Products.CMFPlone. There are Test with checks like this
self.assertEqual(msg['Subject'], 'my subject')
. Do you know what i mean?
Both are equivalent.
Because my subject
is ASCII only, the "utf-8" stuff is not necessary
and therefore maybe omitted with the more modern policy.
Header parts must be encoded when they contain non ASCII characters (and some special constructs that might be misinterpreted as an encoding). The encoding must start and end at word boundaries. A header line (which may contain an encoded part) must respect the maximal line length. There is a special treatment for overlong words requiring encoding. Apart from those limitations, the standard gives much freedom regarding encoding decisions. E.g. ASCII words may or may not be encoded (even though not encoding them is more readable and therefore preferable).
I see two options:
stay with the more modern policy and rewrite the failing tests
use a new policy derived from Compat32
with SMTP line endings
(then you should get as only difference the standard conform
line endings)
I will check this: compat_SMTP = policy.compat32.clone(linesep='\r\n')
With the changes above, the results of sending via localhost and exchange are fine. The Mails aren't corrupt.
Another stupid question: How can i run the Tests in Product.MailHost? bin/test -s ProductsMailHost -t AllHeaders
doesn't work in my coredev buildout installation. If i use python3 -m unittest tests/testMailHost.py
i get an six import error.
I use tox
to run tests.
e.g. tox -e py38 -- -t testAllHeaders
Also see https://github.com/jugmac00/til/blob/master/zope/how-to-run-a-single-test.md
1letter wrote at 2021-1-11 00:42 -0800:
Another stupid question: How can i run the Tests in Product.MailHost?
Most zopefoundation
tests are designed for zope.testrunner
(which likely is activated via your bin/test
).
I assume that the problems you have observed with bin/test
comes
from a limited eggs
definition for the respective part.
Ok, thanks. I forgot to add Products.MailHost to the test-egg section.
@1letter can you please push your changes to a branch (or branches if you change mored than one paackage). I most likely ran into the same issue and would like to test your fix.
@pbauer I'm still rewriting the tests. Some fails yet. I can push a WIP branch for Products.MAilHost and Products.CMFPlone.
@pbauer i have pushed my branches to the repositories Products.MailHost and Products.CMFPlone. @d-maurer one test show up an error but didn't fail:
Error in test testExplicit8bitEncoding (Products.MailHost.tests.testMailHost.TestMailHost)
Traceback (most recent call last):
File "/usr/lib/python3.8/unittest/case.py", line 60, in testPartExecutor
yield
File "/usr/lib/python3.8/unittest/case.py", line 676, in run
self._callTestMethod(testMethod)
File "/usr/lib/python3.8/unittest/case.py", line 633, in _callTestMethod
method()
File "/home/Development/coredev/src/Products.MailHost/src/Products/MailHost/tests/testMailHost.py", line 567, in testExplicit8bitEncoding
mailhost.send('Date: Sun, 27 Aug 2006 17:00:00 +0200\n\n'
File "/home/Development/coredev/src/Products.MailHost/src/Products/MailHost/MailHost.py", line 214, in send
msg, mto, mfrom = _mungeHeaders(messageText, mto, mfrom,
File "/home/Development/coredev/src/Products.MailHost/src/Products/MailHost/MailHost.py", line 465, in _mungeHeaders
return as_bytes(mo), mto, mfrom
File "/home/Development/coredev/src/Products.MailHost/src/Products/MailHost/MailHost.py", line 536, in as_bytes
return msg.as_bytes()
File "/usr/lib/python3.8/email/message.py", line 178, in as_bytes
g.flatten(self, unixfrom=unixfrom)
File "/usr/lib/python3.8/email/generator.py", line 116, in flatten
self._write(msg)
File "/usr/lib/python3.8/email/generator.py", line 181, in _write
self._dispatch(msg)
File "/usr/lib/python3.8/email/generator.py", line 214, in _dispatch
meth(msg)
File "/usr/lib/python3.8/email/generator.py", line 432, in _handle_text
super(BytesGenerator,self)._handle_text(msg)
File "/usr/lib/python3.8/email/generator.py", line 249, in _handle_text
self._write_lines(payload)
File "/usr/lib/python3.8/email/generator.py", line 158, in _write_lines
self.write(lines[-1])
File "/usr/lib/python3.8/email/generator.py", line 406, in write
self._fp.write(s.encode('ascii', 'surrogateescape'))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 3-4: ordinal not in range(128)
32/35 (91.4%)
Error in test test_8bit_special (Products.MailHost.tests.testMailHost.TestMailHostQueuing)
Traceback (most recent call last):
File "/usr/lib/python3.8/unittest/case.py", line 60, in testPartExecutor
yield
File "/usr/lib/python3.8/unittest/case.py", line 676, in run
self._callTestMethod(testMethod)
File "/usr/lib/python3.8/unittest/case.py", line 633, in _callTestMethod
method()
File "/home/Development/coredev/src/Products.MailHost/src/Products/MailHost/tests/testMailHost.py", line 765, in test_8bit_special
mh.send(u"""\
File "/home/Development/coredev/src/Products.MailHost/src/Products/MailHost/MailHost.py", line 214, in send
msg, mto, mfrom = _mungeHeaders(messageText, mto, mfrom,
File "/home/Development/coredev/src/Products.MailHost/src/Products/MailHost/MailHost.py", line 465, in _mungeHeaders
return as_bytes(mo), mto, mfrom
File "/home/Development/coredev/src/Products.MailHost/src/Products/MailHost/MailHost.py", line 536, in as_bytes
return msg.as_bytes()
File "/usr/lib/python3.8/email/message.py", line 178, in as_bytes
g.flatten(self, unixfrom=unixfrom)
File "/usr/lib/python3.8/email/generator.py", line 116, in flatten
self._write(msg)
File "/usr/lib/python3.8/email/generator.py", line 181, in _write
self._dispatch(msg)
File "/usr/lib/python3.8/email/generator.py", line 214, in _dispatch
meth(msg)
File "/usr/lib/python3.8/email/generator.py", line 432, in _handle_text
super(BytesGenerator,self)._handle_text(msg)
File "/usr/lib/python3.8/email/generator.py", line 249, in _handle_text
self._write_lines(payload)
File "/usr/lib/python3.8/email/generator.py", line 155, in _write_lines
self.write(line)
File "/usr/lib/python3.8/email/generator.py", line 406, in write
self._fp.write(s.encode('ascii', 'surrogateescape'))
UnicodeEncodeError: 'ascii' codec can't encode character '\xe4' in position 0: ordinal not in range(128)
33/35 (94.3%)/home/Development/coredev/eggs/pyScss-1.3.7-py3.8-linux-x86_64.egg/scss/types.py:6: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.9 it will stop working
from collections import Iterable
/home/Development/coredev/eggs/pyScss-1.3.7-py3.8-linux-x86_64.egg/scss/namespace.py:172: DeprecationWarning: inspect.getargspec() is deprecated since Python 3.0, use inspect.signature() or inspect.getfullargspec()
argspec = inspect.getargspec(function)
/home/Development/coredev/eggs/pyScss-1.3.7-py3.8-linux-x86_64.egg/scss/selector.py:26: FutureWarning: Possible nested set at position 329
SELECTOR_TOKENIZER = re.compile(r'''
Ran 35 tests with 0 failures, 2 errors, 2 skipped in 4.692 seconds.
Running Products.GenericSetup.testing.ExportImportZCMLLayer tests:
Tear down zope.testrunner.layer.UnitTests in 0.000 seconds.
Set up Testing.ZopeTestCase.layer.ZopeLite in 0.000 seconds.
Set up Products.GenericSetup.testing.ExportImportZCMLLayer in 0.000 seconds.
Running:
Ran 12 tests with 0 failures, 0 errors, 0 skipped in 0.009 seconds.
Tearing down left over layers:
Tear down Products.GenericSetup.testing.ExportImportZCMLLayer in 0.002 seconds.
Tear down Testing.ZopeTestCase.layer.ZopeLite in 0.000 seconds.
Total: 47 tests, 0 failures, 2 errors, 2 skipped in 5.051 seconds.
Have you an Idea or can i ignore this?
1letter wrote at 2021-1-11 04:07 -0800:
... Ran 12 tests with 0 failures, 0 errors, 0 skipped in 0.009 seconds. Tearing down left over layers: Tear down Products.GenericSetup.testing.ExportImportZCMLLayer in 0.002 seconds. Tear down Testing.ZopeTestCase.layer.ZopeLite in 0.000 seconds. Total: 47 tests, 0 failures, 2 errors, 2 skipped in 5.051 seconds.
Have you an Idea or can i ignore this?
You should definitely not ignore tests with an error.
The errors occur in the as_bytes
method for messages which use
8bit encoding. Apparently, your change broke the message handling for
this case.
Logically, for messages with 8bit encoding, the message body is a sequence of bytes which does not need to be "transfer encoded" (e.g. via "quoted-printable" or "base64") but can be transfered directly.
Internally, Python's email
package represents messages under Python 3
always via str
; to represent a byte, a corresponding surrogate unicode
character is used.
The as_bytes
method contains special logic to detect whether
the original message was obtained from a bytes sequence or from a str
.
In the former case, the original message body is restored; otherwise,
a charset must be applied.
Somehow, your change seems to have disturbed the handling.
Note that the "\r\n" line endings are (likely) important only
for the as_bytes
method.
I have seen that you put your special
policy also in the message_from_string
function.
Many messages presented to a MailHost
will not have such line endings.
I do not know whether your policy will pose problems for such messages.
@d-maurer Ok now i understand the code snippet at the end of MailHost.py. I refactor my code (delete all changes) and only add the policy to the fixed_policy. All Test green now. I will update my branch. @pbauer the patch of Products.CMFPlone is not needed, i deleted my branch.
Edit: I test the mailing via my Test-Mailservers and all Mails are not corrupt.
@d-maurer, @jugmac00 Thanks for your help!
@1letter thanks for the pull-request! I deployed it and it solved the issue for me as well (Plone 5.2.3, Python 3.8, Exchange Mailserver).
I am seeing this problem in a Plone 5.2.10 site on Python 3.8. Specifically the contact form has the originally reported problem, with text like "y=ur" instead of "your". The mail test form works, sending a password reset request works. The mail server is Office365, which may be part of the problem, as I do not see the problem locally.
Is anyone else seeing this currently?
I made a patch for the client, and that works. So if others are seeing this, I can make a PR. It is Plone specific though. In the contact form code we call:
message = MIMEText(message, 'plain', encoding)
Here message
is an encoded string (well, bytes then).
In my patch I change this to:
from Products.MailHost.MailHost import message_from_string
message = message_from_string(message.decode(encoding))
Then it works.
@d-maurer I am inclined to conclude that any code in Plone (or elsewhere in Zope projects) that uses email.message_from_string
should import the fixed function from Products.MailHost
instead. Does that in general sound like a good idea?
The same is probably true for any code that uses MIMEText
to create a message, like in the contact form, but that might depend on how this is used.
I have opened an issue in Plone to track this: https://github.com/plone/Products.CMFPlone/issues/3754 Advice is welcome.
@d-maurer I am inclined to conclude that any code in Plone (or elsewhere in Zope projects) that uses
email.message_from_string
should import the fixed function fromProducts.MailHost
instead. Does that in general sound like a good idea?
Yes -- up to the time, https://github.com/python/cpython/issues/85479 gets fixed.
BUG/PROBLEM REPORT (OR OTHER COMMON ISSUE)
https://community.plone.org/t/plone-5-2-3-and-email-encoding-problem/13181 reports a problem which is likely a consequence of Python's quoted-printable implementation's use of
\n
rather than\r\n
as line ending which does not conform to "https://tools.ietf.org/html/rfc2045#section-6.7". As a consequence, soft line breaks are not recognized correctly in some (likely Windows) environments.This is primarily a Python problem -- but
Products.MailHost
might be able to work around.