jhillyerd / enmime

MIME mail encoding and decoding package for Go
MIT License

'slash after first token' when parsing some emails #171

Closed by miknaz 3 years ago

miknaz commented 3 years ago

What I did: parsed some MIME messages

What I expected: to get a parsed object

What I got: an error

Release or branch I am using: [0.8.2] - 2020-10-10


First of all, thanks for the helpful package! However, I get an error when I parse some MIME messages: expected slash after first token. I parse them like this: enmime.ReadEnvelope(strings.NewReader(mime))

Could you please suggest how to avoid this problem? Below is an example of a MIME message that triggers the error:

Return-Path: <bounce+aca41a.9415d2-general.c1c=libero.it@libertycomc.com>
Delivered-To: general.c1c@libero.it
Received: from dcd-15 ([10.103.10.7])
    by dcbackend-15.iol.local with LMTP id wBOnApYfll9gIAUATByfJw
    for <general.c1c@libero.it>; Mon, 26 Oct 2020 02:00:06 +0100
Received: from dcp-33.iol.local ([10.103.10.7])
    by dcd-15 with LMTP id qDyMApYfll/wYgAAkA0XfQ
    ; Mon, 26 Oct 2020 02:00:06 +0100
Received: from libero.it ([10.103.10.7])
    by dcp-33.iol.local with LMTP id MPfpG5Mfll+4JAEAVzGdtA
    ; Mon, 26 Oct 2020 02:00:06 +0100
Received: from so254-8.mailgun.net ([198.61.254.8])
    by smtp-07.iol.local with ESMTP
    id Wqrkk7ui19msRWqrlkdWIE; Mon, 26 Oct 2020 02:00:06 +0100
X-IOL-DMARC: Dominio libertycomc.com non supporta DMARC
X-IOL-DKIM: pass con il dominio d=libertycomc.com
X-IOL-SPF: pass con l'IP 198.12.44.3;libertycomc.com
X-IOL-SEC: _SPFOK_DKIMOK_NODMARC
x-libjamoibt: 2601
Received-SPF: pass
X-CNFS-Analysis: v=2.4 cv=KPrksHJo c=1 sm=1 tr=0 ts=5f961f96 b=1
 a=9V+36KcF1VtsQasMD0NMeA==:117 a=9V+36KcF1VtsQasMD0NMeA==:17
 a=IkcTkHD0fZMA:10 a=afefHYAZSVUA:10 a=5KLPUuaC_9wA:10 a=swULO2ODAAAA:8
 a=8IFCRQWyzuk3np_0jKQA:9 a=AFaUXfkzamQ7vrcO:21 a=frz4AuCg-hUA:10
 a=QEXdDO2ut3YA:10 a=VPRn9Uh7xMAA:10 a=PCWqMptlEwVzXmuMJime:22
Authentication-Results: smtp-07.iol.local;
    dkim=pass header.d=libertycomc.com header.b=TfJm39aZ
DKIM-Signature: a=rsa-sha256; v=1; c=relaxed/relaxed; d=libertycomc.com; q=dns/txt;
 s=krs; t=1603674005; h=Content-Transfer-Encoding: Mime-Version:
 Content-Type: Subject: From: To: Message-Id: Sender: Date;
 bh=faAL8rCmOc2K1K3IP+7qQSPKUakoqX/vM3mtikLFN0I=; b=TfJm39aZ1YvR3TVhRaf5BOcqO6N5ED/Kch55jvSrHK95UTu6KuKpHzAamCG9yJ4ithTOrC53
 2qQhp5E+1c7Jyt7FMAYxKXhApcqBWd7wj5lmytO+ceEb+x6+BcgmWg2+knuXwexuDlYCFxKM
 fwnBUlp5eb5+Mr3YmWVOptF/r7c=
X-Mailgun-Sending-Ip: 198.12.44.3
X-Mailgun-Sid: WyJkMDIzMiIsICJnZW5lcmFsLmNvbnN0cnVjdGlvbkBsaWJlcm8uaXQiLCAiOTQxNWQyIl0=
Received: by luna.sendgrid.net with HTTP; Mon, 26 Oct 2020 01:00:04 +0000
Date: Mon, 26 Oct 2020 01:00:04 +0000
Sender: info@libertycomc.com
Message-Id: <20201033010004.1.C8D1569B39447CD0@libertycomc.com>
To: general.c1c@libero.it
From: LibertyComc <info@libertycomc.com>
Subject: =?utf-8?q?Il_tuo_contratto_di_assistenza_LibertyOnCall_Base_?=
 =?utf-8?q?=C3=A8_scaduto?=
Content-Type: text/html; charset="utf-8"
Mime-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-CMAE-Envelope: MS4xfNvTvUqy52h0PkJMOD3B7ZoQxPkRJk/JMo2kfmIlF7l6O3wP4UkVrLsi6TBS/nxvz1zrbHrMhIiA9gTN2dELm5hM39y496Dn/14V6nFuqnxTAsnInGBo
 sNc1MmqJGgJu/m9Pgn18xzac7TKMqBPiPCe96YDU4K/A0uPOppvxJ55qoey1cHx+NwBsPzZ1VXDTLf1q+eVnP3cjfhUKW5ii5fBVuHTYbCbfNz0NmEwhpZCc
 KO9wgeZjEWpMkfZ+OzlinhYnRpDDLNYt3vmNI3J9Bsk=

<!-- BEGIN: main --><!DOCTYPE html>
<html lang=3D"it">
<head>
   <meta charset=3D"UTF-8">
   <title>Il tuo contratto di assistenza LibertyOnCall Base =C3=A8 scaduto<=
/title>
   <style type=3D"text/css">
      body {
         font-family: Verdana, Arial, Helvetica, sans-serif;
         font-size: 11px
      }
   </style>
</head>
<body>
   <div style=3D"width:100%; background-color:#f7f7f4; padding-bottom:30px;=
 padding-top:10px; margin:0">
      <div style=3D"width:640px; margin-left:auto; margin-right: auto; back=
ground-color:#ffffff;box-shadow: 0 10px 20px rgba(0,0,0,0.19), 0 6px 6px rg=
ba(0,0,0,0.23);">
         <div style=3D"padding:10px; margin:5px 0 0 0; border-bottom:1px so=
lid #dddddd; font-size:11px">
            <img src=3D"http://www.libertycomc.com/assets/email/logo_lib=
ertycommerce_piccolo.png">
         </div>
         <div style=3D"color:#333333; font-weight:bold; background-color:#e=
fefef; padding:5px 10px 8px 10px; font-size:15px; font-family:Arial,Verdana=
,sans-serif">
            Il tuo contratto di assistenza LibertyOnCall Base =C3=A8 scaduto
         </div>
         <div style=3D"color:#333333; padding:10px 10px 10px 10px; font-siz=
e:15px; font-family:Arial,Verdana,sans-serif">
           =20
<table width=3D"100%" border=3D"0" cellspacing=3D"2" cellpadding=3D"5">
    <!--
    <tr>
        <td align=3D"center" bgcolor=3D"#FF3F3F">
            <span style=3D"color: cornsilk; font-weight: bold"> Il tuo contratto di =
assistenza =C3=A8 scaduto </span>
        </td>
    </tr>
    -->

    <tr>
        <td>

            Gentile <strong>General C1C</strong>, <b=
r>
            <br>
            dai nostri archivi risulta che il contratto di assistenza
            <strong>LibertyOnCall Base</strong> da te sottoscritto =C3=A8 scaduto il=
 giorno <strong>23/10/2020</strong>.<br>
            <br>

            Per poter continuare ad usufruire del servizio ti invitiamo a rinnovarlo=
 per un altro anno. <br>
            <br>
            <div align=3D"center" style=3D"margin:10px">
                <a style=3D"text-decoration:none; font-weight: bold; background-color:#=
eee; color:#004276; border-radius: 4px; padding:8px; border:1px solid #bbbb=
bb" href=3D"http://legacy.libertycomc.com/negozio/quickbuy.php?ql=3D7B65=
D7E0A97BC35F9B576510F11119EC1B8E3205D8B08993E8DC641CC36F7FD38A8E4E219480742=
C529F7D7182734594A769813DF58839940268A486990C73722FC20809D093A41EB5261F104D=
301413CC604233EEE6B61139D18A8B94461795">Clicca per rinnovare</a>
            </div>

        =09
            <br>
        </td>
    </tr>
</table>

         </div>
         <div style=3D"text-align:center; background-color:#f6f6f6; padding=
:10px; margin:5px 0 5px 0; border-top:1px solid #dddddd; font-size:10px;  f=
ont-family:Arial,Verdana,sans-serif">&copy;2020 Liberty Line srl</div>
      </div>
   </div>
<img width=3D"1px" height=3D"1px" alt=3D"" src=3D"http://email.libertycomme=
rce.it/o/eJw1zDsSgyAUBdDVhJK5j59S0EQ3wueZMKMyg6TI7mOT5pSnBG_IFiVqUFAgKHcDGE=
lymVearH9qD5qWFQ-DvSbu45vbcXDPLOsQ7-AjJ1J-0yZats5gS4gahXl2cKxFDy8-ucdd5nZeo=
3_yqO38d-1efrQHKFQ"></body>
</html><!-- END: main -->

jhillyerd commented 3 years ago

Thanks for submitting this. I have not tested, but believe this may be caused by a bug in how we handle continuations. We introduced some code that attempts to detect incorrect continuations (aka headers with an accidental space in front of them), and it may be getting confused here.

If you have time, you could try building the mime-dump command in our repo, and running it against this email. Then remove one of the multi-line headers and see if it still triggers. Once we know which header is triggering it, we can figure out how to fix enmime. If you don't have time or inclination to do this, I will take a crack at it on the weekend.

dcormier commented 3 years ago

My money's on this header right here being the problem:

DKIM-Signature: a=rsa-sha256; v=1; c=relaxed/relaxed; d=libertycomc.com; q=dns/txt;
 s=krs; t=1603674005; h=Content-Transfer-Encoding: Mime-Version:
 Content-Type: Subject: From: To: Message-Id: Sender: Date;
 bh=faAL8rCmOc2K1K3IP+7qQSPKUakoqX/vM3mtikLFN0I=; b=TfJm39aZ1YvR3TVhRaf5BOcqO6N5ED/Kch55jvSrHK95UTu6KuKpHzAamCG9yJ4ithTOrC53
 2qQhp5E+1c7Jyt7FMAYxKXhApcqBWd7wj5lmytO+ceEb+x6+BcgmWg2+knuXwexuDlYCFxKM
 fwnBUlp5eb5+Mr3YmWVOptF/r7c=

miknaz commented 3 years ago

@dcormier yes, you are right! I removed this header and no longer get the error. This needs to be fixed somehow.

jhillyerd commented 3 years ago

Given the discussions we had in #159 -- I am in favor of removing the header continuation smarts entirely. Better to just follow the RFCs.

miknaz commented 3 years ago

@jhillyerd sorry, I have not looked into the RFCs that deeply. Could you please explain exactly what that means? I guess long headers are split with some character like '\n'. Do the RFCs specify which characters are valid for splitting long headers?

dcormier commented 3 years ago

@jhillyerd, by that do you mean reverting both #149 and #166?

dcormier commented 3 years ago

"Can you, please, explain me exactly what it means. I guess long headers split with some char like '\n'. And RFCs provides some valid characters to split long headers ?"

Yeah, @miknaz. The RFCs provide for folding headers across multiple lines by using \r\n followed by horizontal whitespace (one or more space or tab characters). The rest of that line is considered a continuation of the header from the previous line.
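As a small illustration of the folding rule, here is a sketch that relies on Go's standard `net/textproto` package, whose `ReadMIMEHeader` joins continuation lines onto the preceding header. The header text is a shortened form of the DKIM-Signature header from this issue, and `unfoldHeaders` is a hypothetical helper name. This shows standard-library behavior only, not how enmime parses headers.

```go
package main

import (
	"bufio"
	"fmt"
	"net/textproto"
	"strings"
)

// unfoldHeaders parses a raw header block with the standard library.
// Lines that start with a space or tab are treated as continuations of
// the previous header and are joined to it with a single space.
func unfoldHeaders(raw string) (textproto.MIMEHeader, error) {
	r := textproto.NewReader(bufio.NewReader(strings.NewReader(raw)))
	return r.ReadMIMEHeader()
}

func main() {
	// Shortened sample: the second line is a continuation of DKIM-Signature,
	// even though it happens to look like a Content-Type header.
	raw := "DKIM-Signature: a=rsa-sha256; v=1; h=Mime-Version:\r\n" +
		" Content-Type: Subject: From;\r\n" +
		"Content-Type: text/html; charset=\"utf-8\"\r\n" +
		"\r\n"
	h, err := unfoldHeaders(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("DKIM-Signature: %q\n", h.Get("DKIM-Signature"))
	fmt.Printf("Content-Type:   %q\n", h.Get("Content-Type"))
}
```

Under strict folding rules the indented `Content-Type: Subject: From;` text stays part of the DKIM-Signature value, and the real Content-Type header is left untouched.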

Since this package tries to handle emails that don't quite conform to the RFCs, it currently tries to catch headers that are merely indented by mistake, rather than being genuine header continuation lines. We saw this in some real emails in production, which led @requaos to write #149 to handle it.

miknaz commented 3 years ago

@dcormier thank you for the explanation! As I understand it, the suggestion here is to revert some fixes in the package? Or what other ways do we have to fix this?

jhillyerd commented 3 years ago

@dcormier Yeah, I think reverting both of those (preserving any good test cases that would still pass) is the best way forward.

If we find samples that fail after that, perhaps just making our individual header parsing more tolerant to that scenario would work better.

requaos commented 3 years ago

I'm just catching up on the changes in #166 and #159

miknaz commented 3 years ago

Hi guys! Could you clarify whether somebody is going to fix this soon, so that we can expect a new release? I would rather not try it myself and risk breaking something, but otherwise I will have to try to fix it.

jhillyerd commented 3 years ago

I will likely give it a shot this weekend, although it should be a simple fix. enmime has very good unit test coverage, so it will let you know if you broke it. :)

jhillyerd commented 3 years ago

Please test with the new master branch and let us know if this is resolved, then we can cut a release.

miknaz commented 3 years ago

@jhillyerd yes, now I don't get that error! Good job, thank you very much!