fedwiki / wiki-client

Federated wiki client-side javascript as a npm module.
Other
117 stars 38 forks source link

The Paragraph plugin should escape html tags and elements. #46

Closed WardCunningham closed 9 years ago

WardCunningham commented 10 years ago

We've ignored the danger of cross-site scripting (XSS) for way too long. Paragraph must escape html tags and elements before it renders its text into the dom. We've created a sanitized html plugin for most of the cases where we want html.

https://github.com/fedwiki/wiki-plugin-html

It might be reasonable to write a script that looks for html in paragraphs and converts them to the new HTML plugin. Perhaps a < in the first character position is enough of a clue.

The HTML plugin doesn't allow YouTube embed codes so that is one other thing we have to consider. One approach would be to whitelist YouTube and Vimeo embed codes. Another approach would be to make plugins for these formats. It would be nice to drag videos onto a Factory and have the right thing happen.

paul90 commented 10 years ago

Having a plugin for video would be a good idea.

Rather than create a new content type for HTML, wouldn't this better be built into the paragraph behaviour? Or indeed anywhere markup can appear, image/map captions...

WardCunningham commented 10 years ago

It is an accident of my own laziness that html ever worked in paragraphs. I started using it even when I knew better because I had to make some pages with headings. I suppose it is rather late in the project to say that html tags don't belong except sequestered in their own item type. I'd like to make that assertion. I'm willing to work hard on page conversion utilities if that is my penance for waiting so long.

Here is the guiding principle: When someone puts up a server and starts investing in content, they need some assurance that forking content from the federation won't put their own work at risk. Said another way, the json has to be inert. If I don't trust caja to sanitize json, I can pull the one plugin that uses it. If that screws up an occasional heading, I can live with that.

paul90 commented 10 years ago

Would it be an idea to add a header content type? I know it was mentioned a long time ago...

WardCunningham commented 10 years ago

Do you mean an additional property that describes how the text property is encoded? Both properties of items?

almereyda commented 10 years ago

Would you want more opinions on i.e. Headers, XSS, ETag and SPDY from externals (remoteStorage/Unhosted)?

paul90 commented 10 years ago

@WardCunningham that would be an idea, the old issue I saw was suggesting adding a new type of header to complement paragraph, but that probably just adds complexity unless they serve some other value.

Adding details about how the paragraph content should be presented is interesting.

@almereyda no, the other type of header...

...

WardCunningham commented 10 years ago

Sanitizing has been in "wants and needs" forever. http://ward.fed.wiki.org/sanitize-html.html

WardCunningham commented 10 years ago

I'd like to deprecate html in paragraphs. My motivation is two fold, safety and simplicity.

I'm ok with coding up a compatibility mode that isn't simple so long as we are taking steps toward eventual simplicity. By simplicity I mean pushing all but simple and documented regex substitutions out of core plugins like paragraph, image, reference and factory.

Safety could be achieved now with little disruption by just adding sanitation to resolveLinks in lib/resolve. This, in my opinion, condemns all text in paragraphs to be uninterpretable by any program simpler than a full html parser. If we can't think of anything better, we need to do at least this.

I'll look into writing a Video plugin that handles common embed tags and maybe hack drag-and-drop to recognize youtube and vimeo link formats.

WardCunningham commented 10 years ago

As promised, I've created a Video plugin. I'll let @paul90 suggest the best process for creating and publishing to an npm module and then including that in the dependents of package wiki.

svdoever commented 9 years ago

Is there consesus on usage of html tags in the paragraph factory, and which html tags to use? My biggest problem is that in a paragraph the enter splits the paragraph (as it should), so everything should be in a sngle line. This dismisses usage of html tags like

  • . If html tags are used it seems best be restricted to "inline" tags only, like strong, b, i, code etc.

  • WardCunningham commented 9 years ago

    We created the HTML plugin to pass sanitized tags. Fork should never distribute unsanitized tags.

    That Paragraph plugin does not yet properly escape angle bracket and ampersand is a serious bug that we have been exploiting way too long. We have no right even talking about security as long as this gaping hole is left open. I'm as guilty as anyone, I admit. This has got to stop. Today could be the day.

    WardCunningham commented 9 years ago

    This issue has been resolved by pull request #71.