Closed paho-outreach closed 3 years ago
I just ran into another side effect of #1113: The LF line endings are now encoded as =0D
in quoted printable messages and I'm seeing a lot of additional empty lines in plain text email messages. I think what happens is that email clients will add a new line for each encoded \r (perhaps it only does this if there is a mix of both \n
and \r\n
line endings?).
Browsers submit text area fields with \r\n endings in Rails and Ruby uses \n endings by default so there's a going to be a lot of effort to ensure we don't get any mixed line endings in there, I would be in favour of having a way to get the old behaviour back.
The problem I'm running into is that pack introduces LF characters, so encoding a message > 998 characters creates something that is not valid per rfc2822, even if you only had CRLF line-endings before.
@paho-outreach Could you show an example message or code snippet that demonstrates this?
The LF line endings are now encoded as =0D in quoted printable messages and I'm seeing a lot of additional empty lines in plain text email messages.
@Fjan Could you provide an example?
I've added a pull request which restores \n normalization at #1210. Please give it a shot as well.
This is what is affecting me, it seems to be related to this issue:
> puts ::Mail.new(body: "€\n1234567890123456789012345678901234567890123456789012345678901234567890123\n", charset: 'UTF-8').encoded
Date: Tue, 06 Mar 2018 00:10:47 +0100
Message-ID: <5a9dce77d36b8_17a2ad9b5be30d4315fa@linobis.mail>
Mime-Version: 1.0
Content-Type: text/plain;
charset=UTF-8
Content-Transfer-Encoding: quoted-printable
=E2=82=AC
1234567890123456789012345678901234567890123456789012345678901234567890123=
> puts ::Mail.new(::Mail.new(body: "€\n123456789012345678901234567890123456789012345678901234567890123456789012\n", charset: 'UTF-8').encoded).encoded
Date: Tue, 06 Mar 2018 00:12:15 +0100
Message-ID: <5a9dcecfb601c_17a2ad9b5be30d432019@linobis.mail>
Mime-Version: 1.0
Content-Type: text/plain;
charset=UTF-8
Content-Transfer-Encoding: quoted-printable
=E2=82=AC=0D
123456789012345678901234567890123456789012345678901234567890123456789012=0D=
> puts ::Mail.new(::Mail.new(::Mail.new(body: "€\n123456789012345678901234567890123456789012345678901234567890123456789012\n", charset: 'UTF-8').encoded).encoded)
Date: Tue, 06 Mar 2018 00:12:37 +0100
Message-ID: <5a9dcee594ee_17a2ad9b5be30d4321a0@linobis.mail>
Mime-Version: 1.0
Content-Type: text/plain;
charset=UTF-8
Content-Transfer-Encoding: quoted-printable
=E2=82=AC=0D
123456789012345678901234567890123456789012345678901234567890123456789012=0D=
=0D
After parsing and encoding this specific message two times it ends up with two \r\r that the browser inteprets as two lines.
Also DKIM and S/MIME signatures get invalidated as the message is effectively changed.
Hi Jeremy, This is an excerpt of a plain text email generated by mail 2.6.6:
Details of the new appointment:
When : Fri 9/3/2018 15:00 to 16:00
Status : Approved (unlimited credit)
ID : 37211227
This is an excerpt of an email generated by mail 2.7.0:
When : Fri 9/3/2018 16:00 to 17:00 =0D
Status : Approved (unlimited credit) =0D
ID : 37211228=0D
We can remove the extra \r
from the source code, however, some of the fields are provided by the user and browsers submit text area fields with \r\n
line endings so it would require filtering all text area fields in every rails app. Note that it is not a problem in HTML messages as those are not white space sensitive. It looks like your fix should do the job, thnx.
Here's a simple example:
require 'mail'
mail = Mail.new do
# need > 998 characters to trigger line-wrapping
body 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
end
p mail.encoded
Outputs:
"Date: Tue, 06 Mar 2018 08:09:03 -0800\r\nMessage-ID: <5a9ebd1feab47_135e2b181ec490d025463@worktop.mail>\r\nMime-Version: 1.0\r\nContent-Type: text/plain;\r\n charset=UTF-8\r\nContent-Transfer-Encoding: quoted-printable\r\n\r\naaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa=\naaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa=\naaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa=\naaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa=\naaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa=\naaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa=\naaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa=\naaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa=\naaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa=\naaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa=\naaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa=\naaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa=\naaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa=\naaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa=\n"
Pre 2.7.0, the line-wrapping characters would be =\r\n
.
Hi, I had opened this within Rails at https://github.com/rails/rails/issues/32836 but was helped to realize this was a separate gem. I used the default mail version 2.7 that comes with rails new
, but am still finding this is occurring after upgrading to 2.7.1.rc1. The sample forkable app is at https://github.com/uhrohraggy/email-bug - that app still needs either action_mailer config or a SendGrid account, but in case it's helpful.
FYI I opened bug in Ruby https://bugs.ruby-lang.org/issues/14741 - correct me if I'm wrong, but [].pack('M') should return CRLF based on the spec, not just LF
Updated by shyouhei (Shyouhei Urabe) 9 months ago
See also issue #14352. We have decided to reject this change. The behaviour is now documented.
To those who don't read Japanese let me summarize: Ruby's pack behabiour was from Perl's and Perl also don't emit CRLF. This is supposedly because in UNIX variants, emails are written in LF only and, to convert them to CRLF is done in sendmail(1). We don't want to break things here.
So what's the best solution for mail gem?
@nicosticht I think we have to wait for https://github.com/mikel/mail/pull/1210 to get merged, but until then downgrading the mail gem to version 2.6.6 works for me
As far as I can tell the encoded LF produced by the Quoted Printable output of [str].pack("M")
does not comply with RFC2045 which defines quoted printable attachments:
A line break in a text body, represented as a CRLF sequence in the text canonical form, must be represented by a (RFC 822) line break, which is also a CRLF sequence, in the Quoted-Printable encoding.
and
Any octet, except a CR or LF that is part of a CRLF line break of the canonical (standard) form of the data being encoded, may be represented by an "=" followed by a two digit hexadecimal representation
The original implementation did this correctly, so I would suggest either reverting or doing [str].pack("M").gsub('=0D',"\r")
to be compliant with the standard.
@jeremy We have been confronted with exactly this issue when implementing S/MIME signatures for Gitlab. When Unicode characters and quoted-printable
is involved, there's a mismatch between the encoded values used for generating the PKCS signature and what is actually generated in the email, that makes the signatures invalid. All of this works when using just ASCII and 7bit
.
Our basic approach is an interceptor that does something like below (the incoming message is multipart with text and html parts, file samples available in a link at the end):
def delivering_email(message)
signed_message = Gitlab::Email::Smime::Signer.sign(
cert: certificate.cert,
key: certificate.key,
data: message.encoded)
signed_email = Mail.new(signed_message)
overwrite_body(message, signed_email)
overwrite_headers(message, signed_email)
end
def overwrite_body(message, signed_email)
# since this is a multipart email, assignment to nil is important,
# otherwise Message#body will add a new mail part
message.body = nil
message.body = signed_email.body.encoded
end
def overwrite_headers(message, signed_email)
message.content_disposition = signed_email.content_disposition if signed_email.content_disposition
message.content_transfer_encoding = signed_email.content_transfer_encoding
message.content_type = signed_email.content_type
end
I've traced down this to exactly the extra =0D
entries incorrectly converted in the decode/encode steps. I was actually able to edit some of our emails and remove those values, and the signatures become valid. But the only solution that is working consistently is downgrading to 2.6.6, which we'd like to avoid.
I also tried some of the patches listed in this ticket, https://github.com/mikel/mail/issues/1325, and your PR https://github.com/mikel/mail/pull/1210. It improves things but does not solve all cases; some of those patches solve the text/html
part but I still have issues in the text/plain
part. What looks like to be the main problem is that the charset
included in the Content-Type
seems to be ignored in the crlf conversions by Mail::Encodings::QuotedPrintable
class.
For the record, I've checked older email notifications from gitlab.com, and this issue with =0D
extra characters has been around for a long time, I think nobody noticed until now because no signatures where involved. It might also be my misunderstanding of the RFC ofc.
In the corresponding Gitlab issue there's also sample message files pre- and post 2.7: https://gitlab.com/gitlab-org/gitlab/issues/197386#note_273117603
Would be really grateful if you could give us your assessment.
Thanks!
A sample of the failing behaviour:
require 'mail'
email = Mail.new do
to 'nicolas@test.lindsaar.net.au'
from 'Mikel Lindsaar <mikel@test.lindsaar.net.au>'
subject 'This will break encoded'
content_type 'text/plain; charset=UTF-8'
content_transfer_encoding 'quoted-printable'
body "0123456789012345678901234567890123456789012345678901234567890123456789\n"
end
email.body.raw_source
=> "0123456789012345678901234567890123456789012345678901234567890123456789\r\n"
email.body.encoded
=> "0123456789012345678901234567890123456789012345678901234567890123456789=0D=\n\n"
Mail.new(email).body.encoded
=> "0123456789012345678901234567890123456789012345678901234567890123456789=0D=\n=0D\n"
This would be encoded to 71 chars, though with quoted-printable
that becomes 70 + 5:
\n
at the end is translated to \r\n
in raw form, then to =0D\n=
because of the conversion of \r
to encoded =0D
, plus the =
for the soft wrap\n
at the newline soft-wrappedThe endlines for each that should be \r\n
are all over broken after that. When afterwards creating a new Mail
object with the input of the original object, the same broken conversion is applied a second time. This is what breaks our signatures.
I ran into the same issue sending mails with rails and the mail gem in version 2.7.1. Mails containing special chars are sent as quoted printable and all \n
new lines are outputted as:
...=0D\r\n
This additional =0D
leeds to ugly extra new line in Outlook.
By debugging I found out that the raw souce of the body changed between versions 2.7.0 and 2.7.1:
In version 2.7.0 the new lines stay \n
, in version 2.7.1 the new lines are converted to \r\n
before.
This happens because of calling ::Mail::Utilities.to_crlf
on the input string in the initialize function of the Mail::Body
class. This was changed with this commit: Restore LF line ending parsing
May be this conversion needs to be undone in the Mail::Encondings::QuotedPrintable.encode
function by calling ::Mail::Utilities.to_crlf
on str before using pack("M")
to encode the raw source.
This issue is fairly simple: this gem switched to using pack("M")
for encoding, but unfortunately there is a bug in Ruby in that means it's not compliant with the RFC with regard to encoding new lines. The bug is marked "won't fix" in ruby because it would be too disruptive to change it after all these years, which I can understand. The only way forward is to revert the change in the gem to work around the new line encoding so it's compliant with the RFC again.
This monkey patch does that, you can simply drop it in your initialisers directory to get the old behaviour back.
module Mail
module Encodings
class QuotedPrintable < SevenBit
def self.encode(str)
::Mail::Utilities.to_crlf([::Mail::Utilities.to_lf(str)].pack("M"))
end
end
end
end
@Fjan Thank you for sharing your monkey patch and explanations.
I converted it to an module that can be prepended to QuotedPrintable and reuses the super
method of the original encode method:
module Mail
module Encodings
module EncodeQuotedPrintableWithoutCR
def encode(str)
::Mail::Utilities.to_crlf(super(::Mail::Utilities.to_lf(str)))
end
end
end
end
Mail::Encodings::QuotedPrintable.singleton_class.prepend(Mail::Encodings::EncodeQuotedPrintableWithoutCR)
This module is applied as an core extension during initialization of my app.
Fixed by #1210.
I also tried some of the patches listed in this ticket, #1325, and your PR #1210. It improves things but does not solve all cases; some of those patches solve the text/html part but I still have issues in the text/plain part. What looks like to be the main problem is that the charset included in the Content-Type seems to be ignored in the crlf conversions by Mail::Encodings::QuotedPrintable class.
@dlouzan Could you share the cases that #1210 does not solve for you? Perhaps as separate new issues.
@jeremy can we see a release of 2.7.2, please? We are looking at using the monkey patch above in the mean time.
Hi,
I am using the monkey patch and ugly extra new line in Outlook are gone 🙏.
But I have noticed some encoding issues, for example ü
becomes =C3=BC
. Is this normal? Any way to fix this?
Thanks
@elalemanyo That's normal for 7bit encoding, it can't be changed without breaking decoding at the recipient side
@elalemanyo That's normal for 7bit encoding, it can't be changed without breaking decoding at the recipient side
@Fjan ok, but was not like this without the monkey patch, and now I have some failing tests 🤔
In #1113,
QuotedPrintable.encode
was changed fromto
The problem I'm running into is that
pack
introduces LF characters, so encoding a message > 998 characters creates something that is not valid per rfc2822, even if you only had CRLF line-endings before.I see that #1157 prevents attachments from being forced into quoted-printable, and attachment corruption was the motivation for this change in #1113. Would you be open to restoring the old behavior, or does more need to be done in attachment land first?