w3c / websub

WebSub Spec in Social Web Working Group
https://w3c.github.io/websub/
285 stars 50 forks source link

At risk: limiting the use of HTML <link> to the HTML <head> #146

Closed aaronpk closed 6 years ago

aaronpk commented 6 years ago

The following features are at risk; if interoperable implementations are not found, they may be removed to advance the other features in this specification to Recommendation:

  • The restriction of <link> tags limited to the <head> of the HTML document

This issue is to track the progress of this at-risk feature.

aaronpk commented 6 years ago

In today's call we resolved to investigate whether any of the subscriber implementations have coded this limitation, and drop the feature if there are no known implementations.

@mblaney @marten-de-vries and @julien51, you three have said that your WebSub subscribers support discovery for HTML documents. Can you please report here whether your implementations look for the <link> tag anywhere in the HTML document or whether they look for the tag only in the <head> section?

julien51 commented 6 years ago

For Superfeedr, only in the <head> section.

sandhawke commented 6 years ago

@julien51 I think markup messed up your answer. Should be "For Superfeedr, only in the <head> section", I think?

So, how would you feel about saying that Superfeedr is broken, and people can put the <link> element anywhere in the document?

mblaney commented 6 years ago

dobrado uses the first <link>'s found, anywhere in the document.

marten-de-vries commented 6 years ago

Idem for Flask-WebSub: first <link> tag anywhere in the document. I never implemented the restriction-to-head part as it was already marked as at risk by then and this was easier to implement.

nightpool commented 6 years ago

robustness principle is probably the correct direction to go here. In theory, <link> is defined by HTML5 to only be allowed in places where Metadata Content is expected—as far as I can tell, this is just the <head> section. WHATWG allows it in the body, but only for a restricted set of types—none of which apply here. So you should only put links in the head section, but you should look for them anywhere.

tantek commented 6 years ago

Wait I thought we were looking for what subscriber implementations did (to discover the hub), and Superfeedr is a hub right? Is it also a subscriber?

aaronpk commented 6 years ago

Yes, Superfeedr will also subscribe to other hubs to get updates about feeds. Here is the implementation report for the subscriber side of Superfeedr https://github.com/w3c/websub/blob/master/implementation-reports/SUBSCRIBER-superfeedr.com.md

aaronpk commented 6 years ago

On today's call, the group resolved the following:

Drop at-risk limitation of <link> discovery restricted to <head>, and add a security consideration saying that user-generated content on pages advertising a hub should be sanitized to remove <link> tags. Replace the at-risk sentence with a "note" that since <link> has been limited to <head> for many years, consuming code might only check <head> so it is more robust to place <link> tags in the <head>

We still need to write the security consideration section, so we can continue that discussion in this thread.

dissolve commented 6 years ago

It may be worth mentioning that this is a potential security issue introduced from an HTML change, and will exist in other specs.

Even when the change was made to HTML, browsers at that time allowed link in the body already (according to commit message) https://github.com/whatwg/html/commit/179983e9eb99efe417349a40ebb664bd11668ddd

sandhawke commented 6 years ago

Yeah, sites which allows folks to post link tags unchecked are vulnerable to css madness, and RSS impersonation, so this is a pretty small issue beyond that. But, still worth mentioning, it seems to me.

Maybe:

The decision about whether a subscriber should look for <link> elements inside a page's <body> (as well as the <head>) is not straightforward, and there is currently no clear consensus. One reason to ignore the <body> during discovery is that some web sites (perhaps accidentally) allow users to post content containing <link> elements. If websub discovery uses such <link> elements, one user could maliciously cause all subscribers to use an alternate hub which later delivers malicious content. Given this potential attack, it may be prudent to do discovery only in the <head> of HTML documents.

aaronpk commented 6 years ago

I like it. I've added this paragraph to the security considerations: https://w3c.github.io/websub/#discovery-0

How do others feel about this?

marten-de-vries commented 6 years ago

I think it would be better to focus the paragraph on publishers than subscribers. They are the ones at risk, and they are the only ones that can mitigate the issue completely.

In the current situation, as a subscriber implementer, I'm unsure what to do. Restrict discovery to <head>? Probably, in line with the advice. But what if a user comes along whose CMS only allows inserting custom code at the end of the <body> tag? As it works with other (legacy) subscribers, and it's not against the spec, I might decide that I should support that use case. So unless the spec makes this mandatory (which is counter to the point of this issue), subscribers cannot solve this problem.

aaronpk commented 6 years ago

@marten-de-vries yeah a subscriber can definitely still look outside the <head>, and supporting those kinds of CMSs is one reason to do that.

On today's call the group resolved to keep the existing text as is.

dissolve commented 6 years ago

The risk isn't just to publishers who would be reading the spec though.

The discussion brought up the point that the security issue is the same for anyone who isn't even away of the websub spec. If you allow <link> tags to be inserted from comments, then its possible for a commenter to set a hub for your page which can be a bad actor. On the same token, they could also insert custom CSS and a host of other things. I don't think telling publishers about basic vulnerabilities of html is really something that should be tackled in websub. At some point it would be like saying 'in theory, people could insert <script> tags in to a page, so you should be careful of that'.

In an ideal world, I would like to see a requirement that you check for the tag anywhere in the page. That would be best for interop. But one of the main goals of the websub spec was to say that any existing pubsubhubbub 0.4 implementations are already fully compliant websub implementations. It also better captures what is done in practice right now, which is that many still only check the head since until 2016, <link> was only allowed inside of <head>.

This is basically why its stated that there is no consensus. And its likely that if there is ever a websub 2 or any spec based on it. It will be able to make checking for <link> anywhere in the page, mandatory.

marten-de-vries commented 6 years ago

Good point about warning publishers essentially being out of scope for the spec. Based on that, I have to agree that the current text is probably the best compromise. It at least highlights the issue.

And yes, I agree that the ideal solution would be a hard requirement, but that ship has sailed.