adeconsulting closed this issue 7 years ago
I don't think this is a bug. I'm looking at the sample data. Here's a single line from the file:
The undersigned has a good faith belief that use of FOX's property in the m= anner described herein is not authorized by FOX, its agents or the law. Al= so, we hereby state that the information in this notification is accurate a= nd, under penalty of perjury under the laws of the State of California and = the United States, that the undersigned is authorized to act on behalf of F= OX with respect to this matter.=20
The "=" characters are not "soft line breaks" because they do not appear at the end of a line (EOL). See RFC 2045 §6.7 for the definition of MIME's quoted-printable encoding. Several passages are relevant, including:
(2) An "=" followed by a character that is neither a hexadecimal digit (including "abcdef") nor the CR character of a CRLF pair is illegal. This case can be the result of US-ASCII text having been included in a quoted-printable part of a message without itself having been subjected to quoted-printable encoding. A reasonable approach by a robust implementation might be to include the "=" character and the following character in the decoded data without any transformation and, if possible, indicate to the user that proper decoding was not possible at this point in the data.
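The rule quoted above can be illustrated with a minimal sketch. It assumes the core MIME::QuotedPrint module, which may or may not be what Email::MIME uses internally, and the three input strings are made up for illustration rather than taken from the attached sample:

```perl
use strict;
use warnings;
use MIME::QuotedPrint qw(decode_qp);

# "=" followed by a space in the middle of a line: neither a valid
# hex escape nor a soft line break (hypothetical input).
my $mid_line = "in the m= anner described";

# "=" immediately before the CRLF: a genuine soft line break.
my $soft_break = "in the m=\r\nanner described";

# Valid hex escapes: "=3D" encodes "=" and "=20" encodes a space.
my $escapes = "a=3Db=20c";

print decode_qp($mid_line),   "\n";
print decode_qp($soft_break), "\n";
print decode_qp($escapes),    "\n";
```

With this decoder, the first string should come back unchanged, the second should be joined into "in the manner described", and the third should become "a=b c". That matches the lenient handling the RFC suggests for an illegal "=" sequence.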
As a very simple example, a raw message string that includes the following headers is used to create an Email::MIME object:
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
I then use the body_str() method to retrieve the decoded content:
my $parsed = Email::MIME->new($msg_raw);
my $content = $parsed->body_str();
Everything else seems to be decoded correctly (e.g. "=3D" and "=20" are replaced with "=" and " " respectively); however, the soft line break characters "= " remain.
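A self-contained version of the steps described here might look like the sketch below. The message body is hypothetical and is not the attached sample. It assumes Email::MIME is installed from CPAN and uses only the new() and body_str() calls already shown:

```perl
use strict;
use warnings;
use Email::MIME;

# Hypothetical raw message: the headers from this report plus a short
# body that exercises valid escapes, a real soft line break, and an
# "=" followed by a space in the middle of a line.
my $msg_raw = join "\r\n",
    'MIME-Version: 1.0',
    'Content-Type: text/plain; charset=UTF-8',
    'Content-Transfer-Encoding: quoted-printable',
    '',
    'valid escapes: 1+1=3D2=20',
    'soft break at end of line: man=',
    'ner',
    'equals sign mid-line: m= anner',
    '';

my $parsed  = Email::MIME->new($msg_raw);
my $content = $parsed->body_str();

print $content;
```

Comparing the printed output line by line against the input shows which "=" sequences the decoder treats as escapes or soft line breaks and which it passes through untouched.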
The sample email is attached.
Thanks for any assistance with this issue!
_SampleEmail_QuotedPrintable_notDecodingCompletely.txt