RomanSixty / Feed-on-Feeds

FeedOnFeeds is a lightweight server-based RSS feed aggregator and reader
http://feedonfeeds.com/
GNU General Public License v2.0

FoF is double-decoding entities where it shouldn't be #36

Status: Closed (fluffy-critter closed this issue 5 years ago)

fluffy-critter commented 5 years ago

Example item causing problems: https://beesbuzz.biz/blog/6128-Federated-identity-with-Atom-and-WebSub

In the feed, all of the content is wrapped in a CDATA section, and the entities inside it are encoded correctly, e.g.:

<h3>Favored approach: In-plain-sight encryption</h3>
<p>My favored approach to providing private content on feeds is to have private items be encrypted.</p><p>Every reader has a public and private key-pair; the publisher knows their public key.</p><p>Every protected entry has a randomly-generated symmetric nonce key, and the private content (<code>&lt;title&gt;</code>, <code>&lt;content&gt;</code>, enclosure links, etc.) are stored in the encrypted payload. (<code>&lt;id&gt;</code> probably needs to remain public for various reasons, and things like <code>&lt;published&gt;</code>/<code>&lt;updated&gt;</code>/<code>&lt;link rel=&quot;alternate&quot;&gt;</code> probably should as well.) The public payload can also include something like:</p><div class="highlight"><pre><span></span><span class="nt">&lt;title&gt;</span>Private content<span class="nt">&lt;/title&gt;</span>
<span class="nt">&lt;content</span> <span class="na">type=</span><span class="s">&quot;text/html&quot;</span><span class="nt">&gt;</span>This is a private entry. Check the original site to see if you have access, or use a feed reader which supports the [insert clever name here] protocol.<span class="nt">&lt;/content&gt;</span>
</pre></div>
<p>This nonce key is then added to the item, encrypted using every trusted reader&rsquo;s public key. (So, if there are 10 followers who are allowed to see the entry, there are 10 copies of the nonce key, each one encrypted by the public key.) Of course the CMS can manage this in any number of ways (e.g. having one or more protected groups of friends who can see things, with specific per-user inclusions and exclusions).</p><p>When a reader gets an encrypted entry, it tries to decrypt each of the encrypted nonce keys with its private key, and then when it gets a valid nonce key it uses that to decrypt the payload.</p><p>The plus sides to this approach:</p>

Unfortunately, something is decoding the &lt; entities et al., so FeedOnFeeds (or maybe it's SimplePie) renders them as actual HTML tags, thus horking the layout of the rest of the entry:

<h3>Favored approach: In-plain-sight encryption</h3>
<p>My favored approach to providing private content on feeds is to have private items be encrypted.</p>
<p>Every reader has a public and private key-pair; the publisher knows their public key.</p>
<p>Every protected entry has a randomly-generated symmetric nonce key, and the private content (<code><title></code>, <code><content></code>, enclosure links, etc.) are stored in the encrypted payload. (<code><id></code> probably needs to remain public for various reasons, and things like <code><published></code>/<code><updated></code>/<code><link rel="alternate"></code> probably should as well.) The public payload can also include something like:</p>
<div class="highlight"><pre><span></span><span class="nt"><title></span>Private content<span class="nt"></title></span>
<span class="nt"><content</span> <span class="na">type=</span><span class="s">"text/html"</span><span class="nt">></span>This is a private entry. Check the original site to see if you have access, or use a feed reader which supports the [insert clever name here] protocol.<span class="nt"></content></span>
</pre></div>
<p>This nonce key is then added to the item, encrypted using every trusted reader’s public key. (So, if there are 10 followers who are allowed to see the entry, there are 10 copies of the nonce key, each one encrypted by the public key.) Of course the CMS can manage this in any number of ways (e.g. having one or more protected groups of friends who can see things, with specific per-user inclusions and exclusions).</p>
<p>When a reader gets an encrypted entry, it tries to decrypt each of the encrypted nonce keys with its private key, and then when it gets a valid nonce key it uses that to decrypt the payload.</p>
<p>The plus sides to this approach:</p>
[Image: screen shot 2018-11-29 at 1 33 39 pm]
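
To make the failure mode concrete, here is a minimal Python sketch. It is not FoF's or SimplePie's actual code (those are PHP). It assumes the item HTML has already been pulled out of the CDATA section, uses the standard library's html.unescape as a stand-in for whatever extra decode is happening, and uses a hypothetical VisibleText helper to approximate the text a browser would display:

```python
from html import unescape
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Hypothetical helper: collects only text nodes, dropping all tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Entities in text nodes are decoded once here, as a browser would do.
        self.parts.append(data)


def visible_text(markup):
    parser = VisibleText()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


# Item HTML as it comes out of the CDATA section (shortened from the example above).
item_html = "<p>(<code>&lt;id&gt;</code> probably needs to remain public)</p>"

# Stand-in for the unwanted extra decode before the HTML reaches the browser.
extra_decoded = unescape(item_html)

print(visible_text(item_html))      # the literal <id> survives as text
print(visible_text(extra_decoded))  # <id> has become a tag and vanishes from the text
```

The first call keeps the literal <id> in the displayed text. The second loses it, because the extra decode has already turned it into a real tag before the browser sees it.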

(Also, it seems that class attributes aren't getting filtered out either; I could have sworn the existing sanitization was supposed to do that...)

fluffy-critter commented 5 years ago

It looks like it's the result of change 06b99a9a7a3271e9c81c2b5e29fba4443a8facfb. @RomanSixty what was the rationale for that change? I believe SimplePie already decodes things as correctly as possible, and any feed that adds an extra layer of entity encoding is just broken, and unfixably so...
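
A small sketch of why I call that unfixable on the reader side, again using Python's html module as a stand-in rather than anything from FoF or SimplePie. A correct feed that wants to display the literal text &lt; and a broken feed that escapes < twice produce identical markup, so the reader cannot tell which one it received:

```python
from html import escape, unescape

# A correct feed whose author wants readers to see the literal text "&lt;".
correct_feed = escape("&lt;")

# A broken feed whose author wants readers to see "<" but escapes it twice.
broken_feed = escape(escape("<"))

# Both feeds deliver exactly the same markup.
print(correct_feed == broken_feed)

# An extra decode in the reader repairs the broken feed but corrupts the
# correct one: the browser decodes the result once more and shows "<" for both.
print(unescape(correct_feed))
```

Any extra decode that rescues the broken feed necessarily mangles the correct one, which is why leaving the content alone seems like the only safe default.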

In any case, I have made a change on my fork at https://github.com/fluffy-critter/Feed-on-Feeds/commit/e87e2fe6ea26e7cec39c03d6974866af8b39d765 which seems to work correctly on properly-formatted feeds. I'll submit a PR after I've tested it further to see which of the feeds I follow break.