miketaylr / planet.webcompat.com

Webcompat Planet's guts.
5 stars 6 forks source link

[sec-critical] Escape HTML special characters #16

Open fbender opened 9 years ago

fbender commented 9 years ago

Currently, HTML code in blog posts is not escaped but rendered. This could lead to XSS etc., but also makes the planet harder to read. Please escape the following characters: <, >, &, ", '

Issue is currently visible at Jonathan's Intent to Implement on July 29th: planet-webcompat-xss

miketaylr commented 9 years ago

Yikes, thanks @fbender. Will fix today.

miketaylr commented 9 years ago

Actually, I don't know how to escape HTML rendered via XSLT. @karlcow, this is more your area of expertise.... any suggestions?

Seems like we want to escape all the instances of <xsl:value-of select="atom:title"/>, <xsl:apply-templates select="atom:content"/> and <xsl:apply-templates select="atom:summary"/>?

karlcow commented 9 years ago

@fbender was it visible in your feed reader or in the planet.webcompat.com home page? Given the screenshot probably in the HTML. https://github.com/webcompat/planet.webcompat.com/blob/master/theme/index.html.xslt

So the intent emails are coming from Yahoo! Pipes which will disappear soon.

Thank you for using Yahoo Pipes! To help focus our efforts on core Yahoo product experiences, users will no longer be able to create new Pipes starting August 30th 2015. The service will be put in read-only mode until we will discontinue Yahoo Pipes on September 30th 2015. You can find more details here.

  • Is the input (feed) having an unescaped character because of Yahoo! Pipes?
  • OR do we convert already escaped feeds back to an unescaped character?

The thing which is a bit strange is I thought the sanitization was handled on the python code side of venus and not the xslt templates. Hmmm needs to investigate.

Original message is at https://groups.google.com/forum/#!searchin/mozilla.dev.platform/intent$20to$20implement$20directory$20/mozilla.dev.platform/Q3BLd4Cwj6Q/JwmBVf7zIgAJ

@miketaylr could you check what Yahoo! Pipes is sending us?

karlcow commented 9 years ago

ok making progress. This is what the pipes xml feed looks like in my reader. So the input is escaped. And somehow we have converted it again to html when copying the description. This is strange.

      <item>
         <title>Intent to implement: directory picking and directory drag and drop</title>
         <link>https://groups.google.com/d/msg/mozilla.dev.platform/Q3BLd4Cwj6Q/JwmBVf7zIgAJ</link>
         <description>MS has a proposal for a minimal set of functionality to support 
directory picking via &lt;input&gt; and to support directory drag and drop. 
https://microsoftedge.github.io/directory-upload/proposal.html 
This spec is a work in progress but we now have an implementation for platforms that have a</description>
         <author>Jonathan Watt</author>
         <guid isPermaLink="false">https://groups.google.com/d/topic/mozilla.dev.platform/Q3BLd4Cwj6Q</guid>
         <pubDate>Wed, 29 Jul 2015 15:37:14 +0000</pubDate>
      </item>

The original template for index.html.xslt https://github.com/rubys/venus/blob/master/themes/asf/index.html.xslt#L302

Ours https://github.com/webcompat/planet.webcompat.com/blob/master/theme/index.html.xslt#L259

Pretty similar. Happening before?

karlcow commented 9 years ago

(btw I thought this repo was transferred to github/webcompat/ ^_^ )

miketaylr commented 9 years ago

(btw I thought this repo was transferred to github/webcompat/ ^_^ )

Yeah, the main repo is over there -- but I had forgotten to enable issues. So we can discuss here and make fixes there.

miketaylr commented 9 years ago

@fbender, just to verify -- are you consuming planet.webcompat.com through a web browser? Or its feed in a feed-reader?

fbender commented 9 years ago

Browser (Firefox)!

karlcow commented 9 years ago

The issue exists in atom.xml.xslt and index.xml.xslt

I spent a big chunk of time today on this without getting any fruitful results. :dragon_face:

My impression is that it is not at the XSLT level (the default xslt there in the venus project output the exact same thing) but really at the level of python in the original venus code.

The frustrating bits come from this. Let's imagine

         <description>MS has a proposal for a minimal set
            of functionality to support directory picking
            via &lt;input&gt; and &lt; to support directory drag and drop.
            https://microsoftedge.github.io/directory-upload/proposal.html
            This spec is a work in progress but we now have an
            implementation for platforms that have a</description>

This is converted as

    <summary type="xhtml"><div xmlns="http://www.w3.org/1999/xhtml">MS has a proposal for a minimal set
            of functionality to support directory picking
            via <input/> and &lt; to support directory drag and drop.
            https://microsoftedge.github.io/directory-upload/proposal.html
            This spec is a work in progress but we now have an
            implementation for platforms that have a</div>
    </summary>

and if had used:

            via &lt;input&gt; and &lt; and &lt;boo&gt; to support directory drag and drop.

It is converted as:

            via <input/> and &lt; and  to support directory drag and drop.

So the issue is somewhere in the sanitization process in python (venus or feedparser).

Btw using the atom.xml.xslt from the common folder in the original venus project leads to the same result.

fbender commented 9 years ago

Can you do a post-processing which encodes the HTML special characters then reverses the process for double-encoding like &amp;lt;?

It seems that venus is no longer maintained. I posted a related issue some time ago, still no response … rubys/venus#24

karlcow commented 9 years ago

Yes it's possible to certainly fix it in the venus python code. I could dive into that. We will have to be careful in case of merge, but many the risk is minimal given that venus seems not maintained.