magento / magento2

Prior to making any Submission(s), you must sign an Adobe Contributor License Agreement, available here at: https://opensource.adobe.com/cla.html. All Submissions you make to Adobe Inc. and its affiliates, assigns and subsidiaries (collectively “Adobe”) are subject to the terms of the Adobe Contributor License Agreement.
http://www.magento.com
Open Software License 3.0
11.58k stars 9.32k forks source link

Character Encoding in Plain Text Emails Fails since 2.2.8/2.3.0 due to emails no longer being sent as MIME #22103

Closed gwharton closed 5 years ago

gwharton commented 5 years ago

Preconditions (*)

  1. Magento 2.2.8 or 2.3+

Steps to reproduce (*)

  1. Install vanilla magento
  2. Setup valid store email address
  3. Create a new email template. Name : "Newsletter Success", Subject : "Newsletter subscription success", Template Content : "You have been successfully subscribed to our newsletter. ö£€"
  4. Convert email template to Plain Text and Save
  5. Config -> Customer -> Newsletters -> Success Email Template -> "Newsletter Success"
  6. Clear cache
  7. Go to home page and subscribe to newsletter

Expected result (*)

  1. Emails received are MIME emails.
  2. All the essential MIME headers such as Content-Type, Content-Transfer-Encoding are preserved.
  3. Special Characters are encoded correctly.

Actual result (*)

  1. Emails received are not MIME emails.
  2. All the essential MIME headers such as Content-Type, Content-Transfer-Encoding are missing.
  3. Special characters may not be encoded properly.

This is new behaviour introduced in 2.2.8. See below for comparison of 2.2.7 2.2.8 and 2.3-develop

2.2.7

Email client shows plain text email You have been successfully subscribed to our newsletter. ö£€

Received Email source (some headers clipped)

To: test@domain.com
Subject: Newsletter subscription success
From: Owner <test@domain.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline
MIME-Version: 1.0
Message-Id: <20190408200546.29448C1894@domain.com>
Date: Mon,  8 Apr 2019 21:05:46 +0100 (BST)

You have been successfully subscribed to our newsletter. =C3=B6=C2=A3=E2=
=82=AC

2.2.8

Email client shows plain text email You have been successfully subscribed to our newsletter. ö£€

Received Email source (some headers clipped)

To: test@domain.com
Subject: Newsletter subscription success
Date: Mon, 08 Apr 2019 20:33:03 +0000
From: test@domain.com
Message-Id: <20190408203304.889F5C16FB@domain.com>

You have been successfully subscribed to our newsletter. ö£€

2.3-develop

Email client shows plain text email You have been successfully subscribed to our newsletter. ö£€

Received Email source (some headers clipped)

To: test@domain.com
Subject: =?utf-8?Q?Newsletter=20subscription=20success?=
Date: Mon, 08 Apr 2019 20:34:39 +0000
From: =?utf-8?Q?Owner?= <test@domain.com>
Message-Id: <20190408203439.7FD9AC16FB@domain.com>

You have been successfully subscribed to our newsletter. ö£€
m2-assistant[bot] commented 5 years ago

Hi @gwharton. Thank you for your report. To help us process this issue please make sure that you provided the following information:

Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:

@magento-engcom-team give me 2.3-develop instance - upcoming 2.3.x release

For more details, please, review the Magento Contributor Assistant documentation.

@gwharton do you confirm that you was able to reproduce the issue on vanilla Magento instance following steps to reproduce?

m2-assistant[bot] commented 5 years ago

Hi @engcom-backlog-nazar. Thank you for working on this issue. In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:

ghost commented 5 years ago

Hello @gwharton . We've been not able to reproduce Your issue following the steps Your provided. Herewith we attach three videos on how we were trying to do that. So please watch them carefully and tell us what we missed or did incorrectly there. Thank You. Video1 Video2 Video3

gwharton commented 5 years ago

Can you view source on the email you received and compare a 2.2.7 and 2.2.8 email with the exact same settings.

I see your email client is recognising the email as UTF-8, but in the emails I receive, I don't see any headers that describe the body of the email as being UTF-8 encoded. My email client is set to auto detect encoding type, as most clients are by default.

Do you manually set your email client to UTF-8 or is it auto detecting UTF-8?

Here is what I see when I view source in 2.2.8. You can see the Subject and From fields are UTF-8 encoded, but there are no MIME headers, or any content type headers to tell the email client that the email is encoded in UTF-8.

Return-Path: <test@domain.com>
Received: by dev.domain.com (Postfix, from userid 33)
    id 7FD9AC16FB; Mon,  8 Apr 2019 21:34:39 +0100 (BST)
To: graham@domain.com
Subject: =?utf-8?Q?Newsletter=20subscription=20success?=
Date: Mon, 08 Apr 2019 20:34:39 +0000
From: =?utf-8?Q?Owner?= <test@domain.com>
Message-Id: <20190408203439.7FD9AC16FB@dev.domain.com>

You have been successfully subscribed to our newsletter. ö£€

Shouldn't there be headers like Content-Type: text/plain; charset=utf-8

Edit : If i manually change my email client to treat all incoming emails as Unicode instead of auto detecting, the email displays correctly. So, my comment stands, the missing Content-type header seems the culprit.

When viewing source, also notice that since 2.2.8 and 2.3.0 the following headers are also missing for plain text emails, hence me commenting, emails no longer sent as MIME.

Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline
MIME-Version: 1.0

You can clearly see this in the source code. 2.2.7 and below used zf1 and 2.2.8/2.3.0 use zf2. In the 2.2.8/2.3.0 framework functions, plain text emails are sent completely raw with no headers. At a minimum i would expect to see content-type text/plain headers.

Take a look at the Zend Framework 1, setBodyText function below (part of zf1 library)

    public function setBodyText($txt, $charset = null, $encoding = Zend_Mime::ENCODING_QUOTEDPRINTABLE)
    {
        if ($charset === null) {
            $charset = $this->_charset;
        }

        $mp = new Zend_Mime_Part($txt);
        $mp->encoding = $encoding;
        $mp->type = Zend_Mime::TYPE_TEXT;
        $mp->disposition = Zend_Mime::DISPOSITION_INLINE;
        $mp->charset = $charset;

        $this->_bodyText = $mp;

        return $this;
}

Compare this to the newly implemented setBodyText function in 2.2.8/2.3.0 (implemented in Framework/Mail)

public function setBodyText($text)
    {
        $this->setMessageType(self::TYPE_TEXT);
        return $this->setBody($text);
    }

Prior to 2.2.8 using zf1, the setBodyText and setBodyHtml functions were almost identical. Both created a MIME email with the charcterset, disposition, encoding and type. The only difference between the two being the type set to either text/plain or text/html.

Now since 2.2.8/2.3.0, the setBodyText and setBodyHtml implemented in Framework/Mail functions are very different. setBodyHtml does the same as the original setBodyHtml did in zf1, but the setBodyText is vastly cut back, only setting the type. It does not set disposition, encoding, characterset or mime. Was this intentional, or did it just slip through because no-one tests using plain text emails?

AlexWorking commented 5 years ago

@gwharton the mail client we used is Crome browser if You mean That and we hadn't set manually any encoding related things. Here are the headers we've got: Headers (Magento version 2.3-develop)

gwharton commented 5 years ago

Yes, thats what I see too.

The email is missing the following headers

Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8

These are present in versions prior to 2.2.8, but are not present in plain text emails sent by 2.2.8+ or 2.3.0+

In the video you posted, there is a little drop down box on your email client which sets the email to "UTF-8". I believe this is forcing the email to be interpreted as UTF-8 encoded which forces it to be displayed correctly. Mozilla Thunderbird (and most other email clients i would imagine) attempt to automatically determine the content encoding type by looking for the content type email headers (which have been missing since 2.2.8/2.3.0 for plain text emails)

So, in response, I say you have reproduced the issue, its just that the email client you are using is forcing all emails to UTF-8, which is hiding the fact that the emails are missing these important headers.

If you repeat the test using version 2.2.7, and look at the email source headers, you will see the missing headers are present.

AlexWorking commented 5 years ago

@gwharton The headers are missing but seems it does not effect in incorrect displaying results. Could't it be that there's a problem with Your mail client auto-detect capabilities? specharem

gwharton commented 5 years ago

I think the fact that your email client is able to automatically detect the content of the email as UTF-8 encoded and display it correctly is a matter of luck, not because Magento is following the specification (which it isn't).

Currently Magento 2.2.8+ and 2.3.0+ is sending plain text emails out in RFC822 format (Non MIME) which supports ONLY the US-ASCII character set. Your email client see's the presence of non US-ASCII characters in the message body and "has a go" at interpreting these as UTF-8, which it succeeds at. The email itself does not conform to RFC822 because it uses characters outside of US-ASCII.

To properly send UTF-8 characters in an email according to the specification, the email must conform to RFC2045 and be sent as a MIME message, with the appropriate MIME, content encoding, content type headers.

Currently plain text emails sent by Magento 2.2.8+/2.3.0+ with UTF-8 characters in them, are neither RFC822 or RFC2045 compliant.

So please consider this from the point of view of "Are the emails correctly formed in accordance to RFC822 (Answer : NO) or RFC2045 (Answer : NO)" and not "do they work with my email client (Answer : Maybe.......Maybe not!)"

unreal4u commented 5 years ago

Today we had some issues as well with the Magento v2.2.8 upgrade. The encoding doesn't seem to be happening so reaaaaally long lines of CSS gets included. For PHP's mail() this is fine, exim (and possibly other MTA's) do not like this, failing the mail (thus NOT delivering it!) with the message:

2019-04-24 11:25:07 1hJE9L-0005Fa-XX <= sales@XXXXX.XX U=XXXXX P=local S=10308
2019-04-24 11:25:07 1hJE9L-0005Fa-XX ** XXXXXXXXXX@XXXXX.XX R=smarthost T=remote_smtp_smarthost: message is too big (transport limit = 1)
2019-04-24 11:25:07 1hJE9L-0005Fa-XX ** XXXXXX@XXXXX.XX R=smarthost T=remote_smtp_smarthost: message is too big (transport limit = 1)

If I look at the differences between v2.2.7 and v2.2.8, it is indeed that the message isn't being encoded properly:

v2.2.7:

Screenshot 2019-04-24 at 17 24 09

v2.2.8:

Screenshot 2019-04-24 at 17 23 59

So I agree with @gwharton : mails are NOT being sent in the proper format, which may bring a LOT of issues with different environments, settings, mail clients and others.

I still have to pinpoint though where this change comes from.

gwharton commented 5 years ago

@unreal4u Yeah, 2.2.7 used zend framework version 1 and everything was fine. 2.2.8 has been rewritten to use zend framework 2 for mail so it matches 2.3.0+, and infact, some of the mail functions that were originally part of the zend framework, have now been brought into Magento/Framework and look as if they have been implemented incorrectly, especially for plain text. Also check out #22065 which covered some encoding issues with html emails, which got closed down as not reproducable. This is more likely to be the cause of your issue and would be good if that one could be resurrected if you can indeed reproduce it.

magento-engcom-team commented 5 years ago

:white_check_mark: Confirmed by @AlexWorking Thank you for verifying the issue. Based on the provided information internal tickets MAGETWO-99426 were created

Issue Available: @AlexWorking, You will be automatically unassigned. Contributors/Maintainers can claim this issue to continue. To reclaim and continue work, reassign the ticket to yourself.

arnoudhgz commented 5 years ago

I had the same / similar issue with special characters not being set properly. It was due to a dotmailer bug. It was solved by updating the extension to 3.2.1: https://github.com/dotmailer/dotmailer-magento2-extension#fixes

tuyennn commented 5 years ago

I have same issue with special characters as Persian or Arabic not being encoded properly. All my clients email templates were broken down as no styling or wrong styling especially the transaction email.

davidverholen commented 5 years ago

setting the encoding to utf-8 seems to be the first try to fix this over here: https://github.com/magento/magento2/blob/2.3-develop/app/code/Magento/Email/Model/Transport.php#L90

Still it's missing some additional headers as mentioned here: https://github.com/magento/magento2/issues/22103#issuecomment-486178910

This could be fixed in the same place, by adding these headers. But I think that's not the best place. Instead the implementation of https://github.com/magento/magento2/blob/2.3-develop/lib/internal/Magento/Framework/Mail/Message.php should probably be fixed to set the correct headers on the zend message?

davidverholen commented 5 years ago

I do not have a real fix yet, but I managed to get a correct Mail again. So the first problem seems to be the line here which "drops" the mime parts from the body and so the new zend message object has a lot less information https://github.com/magento/magento2/blob/2.3-develop/app/code/Magento/Email/Model/Transport.php#L90

the second problem then (when reusing the original zend message) is that the encoding, charset and disposition is not set correctly on the body parts.

gwharton commented 5 years ago

I think the fix should be at the following

https://github.com/magento/magento2/blob/72aebb7051eb34058b9d25c18ab5fe836f8ae305/lib/internal/Magento/Framework/Mail/Message.php#L59-L66

Implement the appropriate block for plain text mails, e.g

   if (is_string($body) && $this->messageType === MailMessageInterface::TYPE_TEXT) {
            $body = self::createTextMimeFromString($body);
        }

and then implement the createTextMimeFromString function in the same way that createHtmlMimeFromString is

Although the Plain text and Html versions should really be identical, apart from setting the TYPE, so I'm not sure you need two functions. Perhaps just a createMimeFromString, and then a setType call to set Plaintext/Html header. Seems to be way too over complicated as far as I can see.

davidverholen commented 5 years ago

ah right. It's not working for html was actually a project specific bug with an email attachment implementation.

The probably something like this should work:

    public function setBody($body)
    {
        if (is_string($body)) {
            $body = self::createMimeFromString($body, $this->messageType);
        }
        $this->zendMessage->setBody($body);
        return $this;
    }

    /**
     * Create HTML mime message from the string.
     *
     * @param string $body
     * @param string $messageType
     * @return \Zend\Mime\Message
     */
    private function createMimeFromString($body, $messageType)
    {
        $part = new Part($body);
        $part->setCharset($this->zendMessage->getEncoding());
        $part->setType($messageType);
        $mimeMessage = new \Zend\Mime\Message();
        $mimeMessage->addPart($part);
        return $mimeMessage;
    }
tuyennn commented 5 years ago

The Zend mail 2 follows rfc-2822 standard and only allow header value contains valid characters.

vendor/zendframework/zend-mail/src/Header/HeaderValue.php
/**
     * Determine if the header value contains any invalid characters.
     *
     * @see    http://www.rfc-base.org/txt/rfc-2822.txt (section 2.2)
     * @param string $value
     * @return bool
     */
    public static function isValid($value)
    {
        $total = strlen($value);
        for ($i = 0; $i < $total; $i += 1) {
            $ord = ord($value[$i]);

            // bare LF means we aren't valid
            if ($ord === 10 || $ord > 127) {
                return false;
            }

            if ($ord === 13) {
                if ($i + 2 >= $total) {
                    return false;
                }

                $lf = ord($value[$i + 1]);
                $sp = ord($value[$i + 2]);

                if ($lf !== 10 || ! in_array($sp, [9, 32], true)) {
                    return false;
                }

                // skip over the LF following this
                $i += 2;
            }
        }

        return true;
    }

Thus the email was not encoded the header name will always be more than 128 length. Then we will always have exception while sending mail even the Sender Name just contains 1 character of non English. Ex: تست

https://github.com/magento/magento2/issues/22778

m2-assistant[bot] commented 5 years ago

Hi @gwharton. Thank you for working on this issue. Looks like this issue is already verified and confirmed. But if you want to validate it one more time, please, go though the following instruction:


magento-engcom-team commented 5 years ago

Hi @gwharton. Thank you for your report. The issue has been fixed in magento/magento2#23537 by @gwharton in 2.2-develop branch Related commit(s):

The fix will be available with the upcoming 2.2.10 release.

magento-engcom-team commented 5 years ago

Hi @gwharton. Thank you for your report. The issue has been fixed in magento/magento2#23535 by @gwharton in 2.3-develop branch Related commit(s):

The fix will be available with the upcoming 2.3.3 release.

bobemoe commented 5 years ago

Any idea how I can patch 2.2.9 with this? Or when 2.2.10 is to be released?

I've tried applying the above commits as patches but some of the hunks fail:

$ curl https://github.com/magento/magento2/commit/fa030f284662ea698ea149b1b5de50cc9ea10e68.patch | patch -p5 -d ~/public_html/vendor/magento/framework/
patching file Mail/Message.php
Hunk #1 succeeded at 20 (offset -3 lines).
Hunk #2 FAILED at 58.
Hunk #3 succeeded at 176 (offset 15 lines).
Hunk #4 FAILED at 176.
Hunk #5 FAILED at 190.
3 out of 5 hunks FAILED -- saving rejects to file Mail/Message.php.rej
bobemoe commented 5 years ago

Ignore me! I was using the wrong commits. This wortked:

$ curl -s https://github.com/magento/magento2/commit/dfdbe7cc4b94c103ab66383577e13b91de395dff.patch | patch -p5 -d ~/public_html/vendor/magento/framework/
patching file Mail/Message.php
$ curl -s https://github.com/magento/magento2/commit/e6fbf99d409eb77e86266e008ff23db9276f3fb2.patch | patch -p5 -d ~/public_html/vendor/magento/framework/
patching file Mail/Test/Unit/MessageTest.php
$ curl -s https://github.com/magento/magento2/commit/235cabaa912aba3ba3b5f0b96ef6ab0105859c27.patch | patch -p5 -d ~/public_html/vendor/magento/framework/
patching file Mail/Message.php
patching file Mail/Test/Unit/MessageTest.php
ionutdicu commented 5 years ago
Override vendor/magento/framework/Mail/Message.php:165:
 private function createHtmlMimeFromString($htmlBody)
    {
        $htmlPart = new Part($htmlBody);
        //Add this: 
        $htmlPart->setEncoding(Mime::ENCODING_BASE64);
       ...
    }
bobemoe commented 5 years ago

This didn't actually fix the problem for me. I needed this as well https://github.com/magento/magento2/pull/23650

gwharton commented 5 years ago

@bobemoe, yes, This PR changes the emails so they are sent as MIME, but does not adjust the encoding. PR #23650 changes the encoding. Both are required together to get the functionality back to what we had in 2.2.7

@ionutdicu I wouldnt recommend setting the encoding to BASE64 as in your example. Follow PR #23650 which sets the encoding properly for both html and plain text emails.