Closed laurensmartina closed 3 years ago
Looks good at first glance. I think it would be good to also link to the relevant spec or RFC in a coment, making it easier to verify the implementation later. Also, did you consider using the more readable quoted-printable encoding, as suggested here? https://stackoverflow.com/a/54688095/740048
Also, did you consider using the more readable quoted-printable encoding, as suggested here? https://stackoverflow.com/a/54688095/740048
Implemented the quoted-printable fix
This is defined in https://tools.ietf.org/rfc/rfc2047.html
It seems like the current implementation looks conformant, except that the quoted_printable_encode
PHP function actually implements the RFC2045 version of quoted-printable, which is a bit more complicated (it handles line breaks differently, but there might be other differences that we should technically correct for). Also, RFC2047 poses a requirement on maximum (line) lengths:
While there is no limit to the length of a multiple-line header field, each line of a header field that contains one or more 'encoded-word's is limited to 76 characters.
Which corresponds to the max line length requirement of RFC2822, which requires breaking on whitespace in the original text.
I wonder if we should just not use quoted_printable_encode
and do our own encoding (which should be simple enough, though I'm not yet sure how to guarantee the max line length).
I would like to get a solution for this merged. We currently have the older base64-based solution on the Stadsbron server, to at least fix this issue, but having an unmerged commit on the live server is a hassle, so let's move this forward.
It's been a while since I looked at this, but I remember that doing the quoted-printable properly involves implementing it ourselves which is not terribly complicated, but needs a thorough reading of the relevant specs. Also, there was some unclarity about maximum lengths for lines and encoded entities, but I can't recall what exactly (I think there is a badly specified rule that any whitespace between quoted-printable elements is ignored, which allows cutting long encoded elements up into multiple ones with a dummy newline in between, to satisfy the pre-quoted-printable-standard line length limits). In any case, this is more work than I'm prepared to invest right now.
So, I would suggest a compromise:
This hurts the plaintext-reasability of such subjects, but e-mail clients should just handle this transparently, and having subjects that are harder to read outside of a proper mailclient is probaly better than not having mail delivered at all.
I've implemented this proposal in an added commit. @laurensmartina could you see if this looks ok to you? I added a TODO for maybe later implementing the quoted printable, with some references.
If this is ok, I'll squash-merge this PR.
Some e-mail providers reject e-mail messages with non-ASCII characters in the subjects.
Fixes #202