Automattic / juice

Juice inlines CSS stylesheets into your HTML source.
MIT License
3.1k stars 220 forks

Error when encoded quotes are used in html attribute #482

Open szaleq opened 2 months ago

szaleq commented 2 months ago

I have some HTML that is the result of other processing and already contains some inline styles. I then use Juice to inline some global CSS, because the output must be used as email content. I'm using Google Fonts loaded by my CMS, and some of the font names are quoted, so the produced CSS looks like this: font-family: "Open Sans", sans-serif; which is valid CSS. When it is inlined into a div style attribute, the quotes get encoded:

<div style="font-family:&quot;Open Sans&quot;, sans-serif;"></div>

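This causes an error in Juice, as it does not decode the HTML entities and instead tries to parse the raw attribute value as CSS rules, so the semicolon that ends each &quot; is treated as a declaration separator. Here are the objects from Juice's parsing: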

{
  prop: Property {
    prop: 'font-family',
    value: '&quot',
    selector: Selector {
      text: '<style>',
      spec: [Array],
      styleAttribute: true,
      tokens: [Expression]
    },
    priority: 0,
    additionalPriority: [ 1, 5 ]
  }
}
{
  prop: Property {
    prop: undefined,
    value: 'OpenSans&quot',
    selector: Selector {
      text: '<style>',
      spec: [Array],
      styleAttribute: true,
      tokens: [Expression]
    },
    priority: 0,
    additionalPriority: [ 1, 23 ]
  }
}

And it fails with the error: Cannot read properties of undefined (reading 'indexOf').
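
The split seen in the objects above is what you would expect from any parser that treats every semicolon as a declaration separator. The following is only a minimal sketch of that behavior, not Juice's actual code: naiveParseDeclarations is a hypothetical helper written for illustration.

```javascript
// Hypothetical helper, NOT Juice's real parser: it splits a style attribute
// on every ';' and then splits each piece on its first ':'.
function naiveParseDeclarations(styleText) {
  return styleText
    .split(';')
    .map((piece) => piece.trim())
    .filter((piece) => piece.length > 0)
    .map((piece) => {
      const colon = piece.indexOf(':');
      // A piece without a colon has no property name at all.
      return colon === -1
        ? { prop: undefined, value: piece }
        : { prop: piece.slice(0, colon).trim(), value: piece.slice(colon + 1).trim() };
    });
}

// Entity-encoded value, as it appears in the attribute.
console.log(naiveParseDeclarations('font-family:&quot;Open Sans&quot;, sans-serif;'));

// The same value with real double quotes.
console.log(naiveParseDeclarations('font-family:"Open Sans", sans-serif;'));
```

With the encoded input, the sketch yields three pieces: the first has prop 'font-family' and value '&quot', and the other two have no property name. With real quotes it yields a single font-family declaration. Any later code that assumes prop is a string would then fail in a way similar to the reported error.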

Minimal reproduction

Just create an input.html file and paste this code:

<!DOCTYPE html>
<html lang="en">
    <head>
        <style>
            div {
                background-color: #fff;
            }
        </style>
    </head>
    <body>
        <div style="font-family:&quot;Open Sans&quot;, sans-serif;"></div>
    </body>
</html>

Then run:

npm run juice input.html output.html

And see the error.

Solution

Juice should decode HTML entities after reading an attribute value and before parsing it into CSS rules. That way &quot; (and other such entities) would be converted back to the proper double-quote character ("), so the value would no longer contain stray semicolons and could be parsed correctly as CSS.
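
As a rough illustration of the proposed decoding step, here is a minimal sketch. It assumes a hypothetical helper, decodeBasicEntities, that is not part of Juice or cheerio and handles only a few named entities plus numeric references. A real fix would more likely rely on a complete entity decoder.

```javascript
// Hypothetical helper for illustration only; not an API of Juice or cheerio.
// Covers a handful of named entities and numeric character references.
const NAMED_ENTITIES = { quot: '"', apos: "'", amp: '&', lt: '<', gt: '>' };

function decodeBasicEntities(text) {
  return text.replace(/&(#[xX][0-9a-fA-F]+|#[0-9]+|[a-zA-Z]+);/g, (match, body) => {
    if (body[0] === '#') {
      const isHex = body[1] === 'x' || body[1] === 'X';
      const codePoint = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references untouched instead of throwing.
      return codePoint > 0x10ffff ? match : String.fromCodePoint(codePoint);
    }
    // Unknown named entities are left as they are.
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : match;
  });
}

const rawAttribute = 'font-family:&quot;Open Sans&quot;, sans-serif;';
console.log(decodeBasicEntities(rawAttribute));
// prints: font-family:"Open Sans", sans-serif;
```

After decoding, the only semicolon left in the value is the one that ends the declaration, so a CSS parser sees a single font-family rule.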

Glavin001 commented 1 month ago

I just ran into this bug. Is there any progress, or is anyone actively working on it? If not, I'd be happy to contribute a fix. However, I am new to contributing to Juice, so I'd appreciate some guidance first. Thank you!

amedve commented 1 day ago

I use v9.1.0, and it happens for me after a dependency update, too.

Looks like the issue is in cheerio v1.0.0. Here is the related issue: https://github.com/cheeriojs/cheerio/issues/4045