Automattic / jetpack

Security, performance, marketing, and design tools — Jetpack is made by WordPress experts to make WP sites safer and faster, and help you grow your traffic.
https://jetpack.com/

Completely unnecessary %-escapes inside a URL #8254

Open vertizio opened 6 years ago

vertizio commented 6 years ago

Some Jetpack subscription emails trigger SpamAssassin's HTTP_EXCESSIVE_ESCAPES rule ('Completely unnecessary %-escapes inside a URL').

#### Header start
Return-Path: <info=xxxxxxxxxxxxxxx.com@b.wordpress.com>
Delivered-To: info@xxxxxxxxxxxxxxx.com
Received: from vps.xxxxxxxxxxxxxxxxxxxxx.com
    by vps.jkws.nl with LMTP id 0Au3EiBSDVrLLQAAAttOJw
    for <info@xxxxxxxxxxxxxxx.com>; Thu, 16 Nov 2017 09:53:52 +0100
Return-path: <info=xxxxxxxxxxxxxxx.com@b.wordpress.com>
Envelope-to: info@xxxxxxxxxxxxxxx.com
Delivery-date: Thu, 16 Nov 2017 09:53:52 +0100
Received: from mail by vps.jkws.nl with spam-scanned (Exim 4.89)
    (envelope-from <info=xxxxxxxxxxxxxxx.com@b.wordpress.com>)
    id 1eFFvZ-0003X5-7x
    for info@xxxxxxxxxxxxxxx.com; Thu, 16 Nov 2017 09:53:52 +0100
X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on vps.jkws.nl
X-Spam-Flag: YES
X-Spam-Level: *
X-Spam-Status: Yes, score=1.4 required=1.0 tests=DKIM_SIGNED,DKIM_VALID,
    DKIM_VALID_AU,HTML_MESSAGE,HTTP_EXCESSIVE_ESCAPES,RCVD_IN_MSPIKE_H4,
    RCVD_IN_MSPIKE_WL,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.1
X-Spam-Report: 
    *  0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked.
    *       See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block
    *      for more information.
    *      [URIs: gravatar.com]
    * -0.0 RCVD_IN_MSPIKE_H4 RBL: Very Good reputation (+4)
    *      [192.0.123.41 listed in wl.mailspike.net]
    *  1.5 HTTP_EXCESSIVE_ESCAPES URI: Completely unnecessary %-escapes inside
    *      a URL
    *  0.0 HTML_MESSAGE BODY: HTML included in message
    * -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's
    *       domain
    *  0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily
    *      valid
    * -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature
    * -0.0 RCVD_IN_MSPIKE_WL Mailspike good senders
Received: from smtp3-2.bur.wordpress.com ([192.0.123.41])
    by vps.jkws.nl with esmtps (TLSv1.2:ECDHE-RSA-AES256-GCM-SHA384:256)
    (Exim 4.89)
    (envelope-from <info=xxxxxxxxxxxxxxx.com@b.wordpress.com>)
    id 1eFFvY-0003Wj-M6
    for info@xxxxxxxxxxxxxxx.com; Thu, 16 Nov 2017 09:53:41 +0100
Received: from wordpress.com (unknown [192.0.91.91])
    by smtp3.bur.wordpress.com (Postfix) with ESMTP id 3ycw5H0TKlzjYTy
    for <info@xxxxxxxxxxxxxxx.com>; Thu, 16 Nov 2017 08:53:39 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=wordpress.com; s=my5;
    t=1510822419; bh=A6XT4tcNI8pEGOZezKBofKlPpo9BTizeCa5f5lneQro=;
    h=Date:To:From:Subject:List-Help:List-Unsubscribe:List-Subscribe:
     List-Archive:From;
    b=Xg7dVPrxdhMNnf8HBKpzTUKzoVUscWzsFLdj9a5GklNVGCzcrWl2kWxpyr0lVUgNz
     7aomCGhzB3/W+0+/DKxhKm5qvb6IJV5LyWogYjJusy0HUjt/qibcij3Ovz+3MYBe78
     XuPqTXKeL9jOP8F5xlv6bobPLdUY+Aj7axeMOt+o=
Date: Thu, 16 Nov 2017 08:53:06 +0000
To: info@xxxxxxxxxxxxxxx.com
From: WordCamp Utrecht <donotreply@wordpress.com>
Subject: *****SPAM***** [Nieuw bericht] Sponsorblog: Siteground
Message-ID: <128222762.2163.0@wordpress.com>
List-Help: <https://en.support.wordpress.com/following/>
List-Unsubscribe: <https://subscribe.wordpress.com/?key=577086142d17bef49a587f68117e6220&amp;email=info%40xxxxxxxxxxxxxxx.com&amp;locale=nl&amp;b=LadR%254CH%25%2C%5BEI9tE2yNX6cDcep%2C%3F%3DI%2BsSFUhmVBLRrScYjqINyi>
List-Subscribe: <https://2017.utrecht.wordcamp.org>
List-Archive: <https://2017.utrecht.wordcamp.org>
Precedence: bulk
X-Automattic-Destination: aW5mb0Bqb3NrbGV2ZXJ3ZWJzdXBwb3J0Lm5s
X-Automattic-Tracking: 0:2:fdIxBHF7bmPlSGpQ2ROr5A==.LC4D/sLBUf6phAjm50VrpmvosNqjEwbYnlBDhG24HAdbk5xZAZHdb+DjH/EZMwCf:128222762:2163:0
MIME-Version: 1.0
Content-Type: multipart/alternative;
    boundary="b1_218a5bf08a6866f5fe695b02c1ecfa22"
X-Antivirus-Scanner: Clean mail though you should still use an Antivirus
X-Spam-Prev-Subject: [Nieuw bericht] Sponsorblog: Siteground
X-EsetId: 37303A299C7A336F617465
#### Header end

My spam threshold may be set quite aggressively, but there should be no need for unnecessary %-escapes in the first place.

#### Message body start

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <style type="text/css" media="all">
    a:hover {   color: red; }
    a {
        text-decoration: none;
        color: #0088cc;
    }

    a.primaryactionlink:link, a.primaryactionlink:visited { background-color: #2585B2; color: #fff; }
    a.primaryactionlink:hover, a.primaryactionlink:active { background-color: #11729E !important; color: #fff !important; }

/*
    @media only screen and (max-device-width: 480px) { 
         .post { min-width: 700px !important; }
    }
*/
    </style>
    <title>WordPress.com</title>
    <!--[if gte mso 12]>
    <style type="text/css" media="all">
    body {
    font-family: arial;
    font-size: 0.8em;
    }
    .post, .comment {
    background-color: white !important;
    line-height: 1.4em !important;
    }
    </style>    
    <![endif]-->
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>

<body class='subscription-body-tag'>

<table border="0" cellspacing="0" cellpadding="0" bgcolor="#DDDDDD"  style="width: 100%; background: #DDDDDD;">
    <tr>
        <td>
                    <span style="display:none !important">
                Berry van Es geplaatst:"Our bronze sponsor SiteGround has 13 years of hosting experience and offers WordPress solutions that can help you manage your website easily.

Check their promo cards to learn about their special hosting offer with extra discount for WordCamp Utrecht."           </span>
                        <table border="0" cellspacing="0" cellpadding="0" align='center'  class="subscribe-body" style="width: 100%; padding: 10px">
                <tr>
                    <td>
                                                <div style="direction: ltr; max-width: 600px; margin: 0 auto; overflow: hidden;">
                            <table border="0" cellspacing="0" cellpadding="0" bgcolor="#ffffff"  class="subscribe-wrapper" style="width: 100%; background-color: #fff; text-align: left; max-width: 1024px; min-width: 320px; margin: 0 auto;">
                                <tr>
                                    <td>
                                        <table border="0" cellspacing="0" cellpadding="0" height="8" background="https://s0.wp.com/i/emails/stripes.gif"  class="subscribe-header-wrap" style="width: 100%; background-image: url(https://s0.wp.com/i/emails/stripes.gif); background-repeat: repeat-x; background-color: #43A4D0; height: 8px;">
                                            <tr>
                                                <td></td>
                                            </tr>
                                        </table>

                                        <table border="0" cellspacing="0" cellpadding="0"  class="subscribe-header" style="width: 100%; color: #08c; font-size: 1.6em; background-color: #EFEFEF; border-bottom: 1px solid #DDD; margin: 0; padding: 0;">
                                            <tr>
                                                <td>
                                                    <h2 class="subscribe-title" style="margin: .4em 0 .3em; font-size: 1.8em; font-size: 16px!important; line-height: 1; font-weight: 400; color: #464646; font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, sans-serif; margin: 5px 20px!important; padding: 0;">
                                                        Nieuwe post op <strong>WordCamp Utrecht</strong>                                                    </h2>
                                                </td>
                                                <td style="text-align: right;">
                                                    <img border="0" class="head-avatar" src="http://s0.wp.com/i/emails/blavatar.png" alt=""  width="32" height="32" / style="vertical-align: middle; margin: 5px 20px 5px 0; vertical-align: middle;">
                                                </td>
                                            </tr>
                                        </table>

                                        <table style="width: 100%"  border="0" cellspacing="0" cellpadding="20" bgcolor="#ffffff">
                                            <tr>
                                                <td>
                                                    <table style="width: 100%"  border="0" cellspacing="0" cellpadding="0">
                                                        <tr>
                                                            <td valign="top" class="the-post">
                                                                                                                                    <table style="width: 100%"  border="0" cellspacing="0" cellpadding="0">
                                                                        <tr>
                                                                            <td style="width: 60px !important; white-space: nowrap; vertical-align: top;">
                                                                                <a href="https://2017.utrecht.wordcamp.org/?author=9781384"  style="text-decoration: underline; color: #2585B2; display: block; margin-right: 10px;"><img border="0" alt='' src='http://1.gravatar.com/avatar/1edc5da2c8521919811ad60ca238cd16?s=50&#038;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D50&#038;r=G' class='avatar avatar-50' height='50' width='50' /></a>
                                                                            </td>
                                                                            <td>
                                                                                <h2 class="post-title"  style="margin: .4em 0 .3em; font-size: 1.8em; font-size: 1.6em; color: #555; margin: 0; font-size: 20px;"><a href="https://2017.utrecht.wordcamp.org/sponsorblog-siteground/" style="text-decoration: underline; color: #2585B2; text-decoration: none !important;">Sponsorblog: Siteground</a></h2>
                                                                                <span style="color: #888;">Door <a href="https://2017.utrecht.wordcamp.org/?author=9781384"  style="text-decoration: underline; color: #2585B2; color: #888 !important;">Berry van Es</a> </span>
                                                                            </td>
                                                                        </tr>
                                                                    </table>

                                                                <div class="post-content" style="direction: ltr; margin-top: 1em; max-width: 560px;">
                                                                                                                                            <p style="direction: ltr; font-size: 14px; line-height: 1.4em; color: #444; font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, sans-serif; margin: 0 0 1em;">Our bronze sponsor SiteGround has 13 years of hosting experience and offers WordPress solutions that can help you manage your website easily.</p>
<p style="direction: ltr; font-size: 14px; line-height: 1.4em; color: #444; font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, sans-serif; margin: 0 0 1em;">Check their promo cards to learn about their special hosting offer with extra discount for WordCamp Utrecht.</p>
<div style="direction: ltr; clear: both"></div>                                                                                                                                 </div>

                                                                                                                                    <div class="meta"  style="direction: ltr; color: #999; font-size: .9em; margin-top: 4px; line-height: 160%; padding: 15px 0 15px; border-top: 1px solid #eee; border-bottom: 1px solid #eee; overflow: hidden">
                                                                        <strong><a style="text-decoration: underline; color: #2585B2;"  href="https://2017.utrecht.wordcamp.org/?author=9781384">Berry van Es</a></strong> | november 16, 2017 om 10:53 am | Categorieën:<a style="text-decoration: underline; color: #2585B2;"  href="https://2017.utrecht.wordcamp.org/?taxonomy=category&amp;term=sponsorblogs">Sponsorblogs</a>
 | URL: <a style="text-decoration: underline; color: #2585B2;"  href="https://wp.me/p8G0B4-yT">https://wp.me/p8G0B4-yT</a>                                                                  </div>

                                                                <p class="subscribe-action-links"  style="direction: ltr; font-size: 14px; line-height: 1.4em; color: #444; font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, sans-serif; margin: 0 0 1em; font-size: 14px; color: #666; padding: 0; width: auto; padding-top: 1em; padding-bottom: 0em; margin-bottom: 0; margin-left: 0; padding-left: 0;">
                                                                    <table class="auto-width" border="0" cellspacing="0" cellpadding="0" style="width: 100%; width: auto">
                                                                        <tr>
                                                                            <td style="width: 10px;"><a href="https://2017.utrecht.wordcamp.org/sponsorblog-siteground/#respond" style="text-decoration: underline; color: #2585B2; -moz-border-radius: 10em; -webkit-border-radius: 10em; border-radius: 10em; border: 1px solid #11729E; text-decoration: none; color: #fff; text-shadow: 0 1px 0 #11729E; background-color: #2585B2; padding: 5px 15px; font-size: 16px; line-height: 1.4em; font-family: Helvetica Neue, Helvetica, Arial, sans-serif; font-weight: normal; margin-left: 0; white-space: nowrap;">Reageren</a></td>
                                                                                                                                                    <td>&nbsp;&nbsp;&nbsp;<a class="subscribe-action-link" href="https://2017.utrecht.wordcamp.org/sponsorblog-siteground/#comments" style="text-decoration: underline; color: #2585B2; text-decoration: underline">Bekijk alle reacties</a></td>
                                                                                                                                                                                                                        </tr>
                                                                    </table>
                                                                </p>
                                                                                                                        </td>
                                                        </tr>
                                                    </table>
                                                </td>
                                            </tr>
                                        </table>

                                        <table border="0" cellspacing="0" cellpadding="20" bgcolor="#efefef"  class="subscribe-wrapper-sub" style="width: 100%; background-color: #efefef; text-align: left; border-top: 1px solid #ddd;">
                                            <tr>
                                                <td class="subscribe-content"  style="border-top: 1px solid #f3f3f3; color: #888; font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, sans-serif; font-size: 14px; background: #efefef;">
                                                    <p style="direction: ltr; font-size: 14px; line-height: 1.4em; color: #444; font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, sans-serif; margin: 0 0 1em; font-size: 12px; line-height: 1.4em; margin: 0px 0px 10px 0px;">
                                                        <a style="text-decoration: underline; color: #2585B2;"  href="https://subscribe.wordpress.com/?key=577086142d17bef49a587f68117e6220&amp;email=info%40xxxxxxxxxxxxxxx.com&amp;locale=nl&amp;b=LadR%254CH%25%2C%5BEI9tE2yNX6cDcep%2C%3F%3DI%2BsSFUhmVBLRrScYjqINyi">Abonnement opzeggen</a> om geen berichten meer te ontvangen van WordCamp Utrecht.<br/>
                                                        Verander je e-mail instellingen in <a style="text-decoration: underline; color: #2585B2;"  href="https://subscribe.wordpress.com/?key=577086142d17bef49a587f68117e6220&amp;email=info%40xxxxxxxxxxxxxxx.com&amp;locale=nl">Beheer abonnementen</a>.                                                 </p>

                                                    <p style="direction: ltr; font-size: 14px; line-height: 1.4em; color: #444; font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, sans-serif; margin: 0 0 1em; font-size: 12px; line-height: 1.4em; margin: 0px 0px 0px 0px;">
                                                        <strong>Problemen met klikken?</strong> Kopieer en plak deze URL in je browser: <br />
                                                        <a style="text-decoration: underline; color: #2585B2;"  href="https://2017.utrecht.wordcamp.org/sponsorblog-siteground/">https://2017.utrecht.wordcamp.org/sponsorblog-siteground/</a>
                                                    </p>
                                                </td>
                                            </tr>
                                        </table>
                                    </td>
                                </tr>
                            </table>

                            <table border="0" cellspacing="0" cellpadding="0" height="3" background="https://s0.wp.com/i/emails/stripes.gif"  class="subscribe-footer-wrap" style="width: 100%; background-image: url(https://s0.wp.com/i/emails/stripes.gif); background-repeat: repeat-x; background-color: #43A4D0; height: 3px;">
                                <tr>
                                    <td></td>
                                </tr>
                            </table>
                        </div>
                    </td>
                </tr>
            </table>

            <br />
        </td>
    </tr>
</table>

<img alt="" border="0" src="http://pixel.wp.com/b.gif?blog=128222762&#038;post=2163&#038;subd=2017.utrecht.wordcamp.org&#038;ref=&#038;email=1&#038;email_o=jetpack&#038;host=jetpack.wordpress.com" width="1" height="1" /></body></html>

#### Message body end

jeherve commented 6 years ago

@gravityrail Do you think you could take a look, since you worked on that part of subscriptions not that long ago in #8194?

Thanks!

gravityrail commented 6 years ago

I'm looking into this.

gravityrail commented 6 years ago

I noticed a couple of weird things.

First, the pixel.wp.com URL encodes ampersands as &#038; instead of &amp;, which seems odd (and unnecessary).

Second, the Gravatar URL includes a URL-encoded URL inside it:

http://1.gravatar.com/avatar/1edc5da2c8521919811ad60ca238cd16?s=50&#038;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D50&#038;r=G

However, neither of these trips the regex listed in the SpamAssassin 3.4 cf file: https://apache.googlesource.com/spamassassin/+/3.4/rules/20_uri_tests.cf#56

My suspicion is that the Gravatar URL is tripping the rule, perhaps a customized version of it, or that something is wrong with the rule itself.

We could try to fix this by being less aggressive in our escaping, perhaps by testing content against this rule before sending it out and, to start with, logging any hits to IRC.
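
A minimal sketch of the "test content before sending" idea, in Python for illustration only. It does not reproduce the SpamAssassin regex linked above; instead it assumes a simpler stand-in definition of "unnecessary": any %-escape that decodes to an RFC 3986 unreserved character (a letter, a digit, or one of `-._~`). The names `unnecessary_escapes` and `check_href` are hypothetical and not part of Jetpack or SpamAssassin.

```python
import html
import re
import string
from typing import Dict, List
from urllib.parse import unquote

# RFC 3986 "unreserved" characters never need percent-encoding.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def unnecessary_escapes(url: str) -> List[str]:
    """Return the %-escapes in url that decode to an unreserved character."""
    return [
        m.group(0)
        for m in ESCAPE.finditer(url)
        if chr(int(m.group(1), 16)) in UNRESERVED
    ]


def check_href(href: str) -> Dict[str, List[str]]:
    """Check an HTML attribute value as-is and after one URL-decoding pass.

    The attribute is HTML-unescaped first, so &amp; and &#038; both
    become a plain "&" before the URL is inspected.
    Assumption: looking at a once-decoded copy is an extra check added
    here to catch double-encoded sequences; it is not taken from the rule.
    """
    url = html.unescape(href)
    return {
        "raw": unnecessary_escapes(url),
        "decoded_once": unnecessary_escapes(unquote(url)),
    }


if __name__ == "__main__":
    # Unsubscribe link copied from the message body in this report.
    href = (
        "https://subscribe.wordpress.com/?key=577086142d17bef49a587f68117e6220"
        "&amp;email=info%40xxxxxxxxxxxxxxx.com&amp;locale=nl"
        "&amp;b=LadR%254CH%25%2C%5BEI9tE2yNX6cDcep%2C%3F%3DI%2BsSFUhmVBLRrScYjqINyi"
    )
    print(check_href(href))
```

Under this stand-in definition the Gravatar URL above reports nothing, since `%3A`, `%2F`, `%3F` and `%3D` all decode to reserved characters. The unsubscribe link also reports nothing as sent, but its once-decoded copy contains `%4C` (from the double-encoded `%254C`). Whether SpamAssassin evaluates decoded copies of a URL is not established in this thread, so treat that result as a lead to verify rather than a diagnosis.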

stale[bot] commented 6 years ago

This issue has been marked as stale. This happened because:

No further action is needed. But it's worth checking if this ticket has clear reproduction steps and it is still reproducible. Feel free to close this issue if you think it's not valid anymore — if you do, please add a brief explanation.