fastmail / Squire

The rich text editor for arbitrary HTML.
MIT License
4.78k stars 406 forks source link

Block vs Inline level elements and br tags #409

Open gwing33 opened 3 years ago

gwing33 commented 3 years ago

We've been using your editor for some time now. About a year ago, I even compared it to other HTML editors, and in terms of ease of use, consistency and maintainability, it still seemed to be better than the competition. So this issue/discussion below really pales in comparison to how well this has worked over the years.

Block vs Inline level elements and br tags

One issue we've had for quite some time though, has been with sometimes Squire adds spaces when pasting in HTML. I have looked through most of this repo's issues, but in case I missed a duplicate question feel free to let me know.

Our customer issue:

They often want fancy emails, so they copy/paste in HTML into the editor. At times, their HTML is setup so that <img /> or <span> tags (or other normally inline level elements) have display:block styles on them. I get it, email HTML is hard. Old legacy vs new shiny.

Here is an example boiled down:

<span style="display:block;">Hello</span>
<div>world</div>

becomes

<div>
    <span style="display:block;">Hello</span>
    <br>
</div>
<div>world<br></div>

As you'd expect, there is an empty line between between Hello and world. Really, world should just be on the next line down.

Attempt at Patching

In particular, the getNodeCategory function here is not checking to see if it has a style="display: block" or something similar. I did quickly test this out:

function getNodeCategory ( node ) {
    if (node && node.style && node.style.display) {
        return node.style.display.includes('inline') ? INLINE : BLOCK;
    }
    // ...
}

The immediate issue with this is with <img />, so I wasn't super impressed by this initial test. Not to mention this would change the entire behavior for anything that uses this function. However, it did seem to help in some scenarios, so maybe it's worth further discussion.

Changing our usage

Rather than attempting to change how Squire works, maybe the better solution would be to change how we are using it?

gwing33 commented 3 years ago

Just a quick update. I monkey patched insertHTML and setHTML to add our own parser that strips out display style tags from inline elements. This seems to resolve 99% of the issues. It seems like when there is an issue, it's something like an empty div or td which is a bit easier to instruct people how to improve their emails.

Edit: If this would be valuable for others, I could look at doing a formal PR for this.

the-djmaze commented 3 years ago

If you dig deeper, people should NOT paste HTML. There is way too much CSS and HTML5 that mail clients don't understand. Especially the horrible Outlook, which is just Word that tries HTML but fails. So fancy e-mail is broken (your customer won't see that).

gwing33 commented 3 years ago

Yeah I understand that, but teach that to customers who want to add fancy email footers. I'd like to find a better solution for stripping/cleaning pasted HTML, but right now the best thing we've found is to put it into google docs and copy it from there. I haven't looked too hard at any 3rd party cleansing libraries just yet.

the-djmaze commented 3 years ago

You could start with:

HTML = HTML
    // Remove Microsoft Office styling
    .replace(/(<[^>]+[;"'])\s*mso-[a-z-]+\s*:[^;"']+/gi, '$1')
    // Remove hubspot data-hs- attributes
    .replace(/(<[^>]+)\s+data-hs-[a-z-]+=("[^"]+"|'[^']+')/gi, '$1');

or

// Drop Microsoft Office style properties
const rgbRE = /rgb\((\d+),\s*(\d+),\s*(\d+)\)/g,
    hex = n => ('0' + parseInt(n).toString(16)).slice(-2);
body.querySelectorAll('[style*=mso]').forEach(el =>
    el.setAttribute('style', el.style.cssText.replace(rgbRE, (m,r,g,b) => '#' + hex(r) + hex(g) + hex(b)))
);