PlanBCode / hypha

1 stars 8 forks source link

Improve copy-paste handling in Wymeditor #348

Open matthijskooijman opened 3 years ago

matthijskooijman commented 3 years ago

Since #345, all pastes into the wymeditor are stripped from their formatting automatically, to prevent garbage when pasting from e.g. Word.

However, this makes all pastes text-only and also prevents copy-pasting within the same editor, or between editors. This should be solved. Two ideas:

  1. When copying content from inside the wymeditor (or maybe more generally), mark it as "is clean HTML" somehow. Then, inside the paste handler, use the HTML as-is.
  2. When pasting HTML, do not strip all formatting, but allow some limited formatting (e.g. headings, bold/italic/underline, maybe lists) and heavily sanitize it. This is already what wymeditor does to some degree, which is obviously not enough to clean up the mess coming from Word or elsewhere, but when pasting this cleaning can probably be more aggressive.

Number 1. is probably easiest to implement and should probably be implemented even when 2. is implemented, to allow complete copy-paste within the wymeditor and limited pasting (with aggressive cleaning) from elsewhere.

Marking copies from inside wymeditor

A "copy" event handler exists: developer.mozilla.org/en-US/docs/Web/API/Element/copy_event which can probably manually retrieve the selected content, customize it to mark it as "from inside hypha" so that the paste event handler can recognize it and skip the format-stripping.

One way I thought we could mark the data as being "from inside hypha" could be to tweak the "format" on the clipboard. The clipboard typically contains multiple versions (e.g. "text/plain" and "text/html"), and I hoped we could just add an "application/hypha-html" or something in addition to the default types, with identical content to the "text/html" version. However, reading the spec, I think that the formats that can be written from "untrusted" scripts is limited: w3c.github.io/clipboard-apis/#writing-to-clipboard.

So I guess adding some wrapper or marker inside the "text/html" version should be done instead. This marker should ideally be invisible and unobtrusive, for the case that this HTML is pasted elsewhere.

We should probably also try to generate a "text/plain" version, to allow copy-pasting from Wymeditor into a plain textarea or so.

Incomplete HTML escaping in wym.paste

A separate, but related issue exists in the wym.paste code.

This is a corner case where the text is interpreted as HTML rather than text (i.e. any HTML tags in the text content, which must have been escaped as e.g. > in the original clipboard contents, would become actual HTML again after the paste). This only occurs when inserting into the outer body tag, so when no other container is selected (which is pretty hard to achieve, except when the editor is completely empty). This also seems to occur with the existing "Paste from word" feature as well.

The problem is in https://raw.githubusercontent.com/PlanBCode/hypha/master/system/wymeditor/jquery.wymeditor.js, in the WYMeditor.editor.prototype.paste function (can't deeplink here, since the file is too big to be displayed by github). The problem is here (and also in the following else):

    if (typeof container === 'undefined' ||
            (container && container.tagName.toLowerCase() === WYMeditor.BODY)) {
        // No selection, or body selection. Paste at the end of the document

        if (isSingleLine) {
            // Easy case. Wrap the string in p tags
            paragraphs = jQuery(
                '<p>' + paragraphStrings[0] + '</p>',
                wym._doc
            ).appendTo(wym._doc.body);
        } else {

Haven't looked closely at the code yet, but this is probably a matter of escaping the text inserted into the <p> tags. Might be good to look at the wymeditor repo to see if there was a specific usecase for this code, though.