kangax / html-minifier

Javascript-based HTML compressor/minifier (with Node.js support)
http://kangax.github.io/html-minifier/
MIT License
4.95k stars 571 forks source link

CVE-2022-37620/ ReDoS found in htmlminifier.js #1135

Open secdevlpr26 opened 2 years ago

secdevlpr26 commented 2 years ago

A Regular Expression Denial of Service (ReDoS) flaw was found in kangax html-minifier 4.0.0 via the candidate variable in htmlminifier.js. The ReDoS vulnerability can be mitigated with several best practices described here: [https://snyk.io/blog/redos-and-catastrophic-backtracking/]

michaeljauk commented 2 years ago

Is anyone willing to work on this?

blv-raulcatalan commented 1 year ago

Any update on this?

rquadling commented 1 year ago

One of the lines referred to be the CVE has the following regex \s+([1-9][0-9]*w|[0-9]+(?:\.[0-9]+)?x)$

Would changing it to \s+([1-9][0-9]*w|[0-9]+(?:\.[0-9]+|)x)$ be enough?

\s+([1-9][0-9]*w|[0-9]+(?=(\.[0-9]+|))\2x)$ is another way that should stop the backtracking (based this upon https://snyk.io/blog/redos-and-catastrophic-backtracking/.

timbomckay commented 1 year ago

I stumbled upon a fork of this maintained by terser: https://www.npmjs.com/package/html-minifier-terser

Probably the way to go.

littleblack111 commented 6 months ago

any updates?

timbomckay commented 6 months ago

any updates?

@littleblack111 as mentioned above, along with the slew of references to others switching, html-minifier-terser is probably the best solution. Aside from some repo adjustments this solution appears to be fairly abandoned, especially since the package hasn't been updated in 5 years, while html-minified-terser is maintained by Terser.

rquadling commented 6 months ago

Now I've just got to remember what project I'm involved in that has this one as an issue!

Hessah95 commented 6 months ago

any update on this issue?

timbomckay commented 6 months ago

any update on this issue?

@Hessah95 The comments in this thread point to another solution that's maintained.

ra-dave commented 1 month ago

html-minifier-terser latest version (7.2.0) still has the vulnerability? I just installed it and it says so.

marcoandre1 commented 1 month ago

@ra-dave I first installed version 7.2.0 as well, but it broke my unit tests. I then went for version 5.1.1 which keeps my unit tests passing and it made the vulnerability go away. Of course, this only works if you don’t need the newest version.

DanielRuf commented 3 weeks ago

Can anyone provide a working proof of concept that confirms the issue can be actively exploited?

Also what would be the definitive solution here?

DanielRuf commented 2 weeks ago

It seems the CVE record differs from https://security.snyk.io/vuln/SNYK-JS-HTMLMINIFIER-3091181, which contains more details.

Not sure why this is the case. I will clarify with Snyk.

Edit: the candidate variable is not even used here, because the relevant code branch for srcset (else if (isSrcset(attrName, tag)) {) is not even executed in the PoC.

It seems the CVE description is more than misleading, at best incomplete and inaccurate.

DanielRuf commented 2 weeks ago

It seems the PoC is more complex than needed.

The code basically makes not much sense: minify(attrName = '\t'.repeat(547703) + '.\t1x', tag = '\t'.repeat(547703) + '.\t1x' )

It should be probably this:

minify('\t'.repeat(547703) + '.\t1x', '\t'.repeat(547703) + '.\t1x' )

Because the minify function has the following signature: exports.minify = function(value, options) {

The options parameter does not have an impact. Because when I leave the second parameter undefined, then time node test.js gives the same duration.

So the following PoC code is sufficient: result = minify('\t'.repeat(54770) + '.\t1x' )

The issue is in the following code part:

  var customFragments = options.ignoreCustomFragments.map(function(re) {
    return re.source;
  });
  if (customFragments.length) {
    var reCustomIgnore = new RegExp('\\s*(?:' + customFragments.join('|') + ')+\\s*', 'g');
    // temporarily replace custom ignored fragments with unique attributes
    value = value.replace(reCustomIgnore, function(match) {
      if (!uidAttr) {
        uidAttr = uniqueId(value);
        uidPattern = new RegExp('(\\s*)' + uidAttr + '([0-9]+)' + uidAttr + '(\\s*)', 'g');
        if (options.minifyCSS) {
          options.minifyCSS = (function(fn) {
            return function(text, type) {
              text = text.replace(uidPattern, function(match, prefix, index) {
                var chunks = ignoredCustomMarkupChunks[+index];
                return chunks[1] + uidAttr + index + uidAttr + chunks[2];
              });
              var ids = [];
              new CleanCSS().minify(wrapCSS(text, type)).warnings.forEach(function(warning) {
                var match = uidPattern.exec(warning);
                if (match) {
                  var id = uidAttr + match[2] + uidAttr;
                  text = text.replace(id, ignoreCSS(id));
                  ids.push(id);
                }
              });
              text = fn(text, type);
              ids.forEach(function(id) {
                text = text.replace(ignoreCSS(id), id);
              });
              return text;
            };
          })(options.minifyCSS);
        }
        if (options.minifyJS) {
          options.minifyJS = (function(fn) {
            return function(text, type) {
              return fn(text.replace(uidPattern, function(match, prefix, index) {
                var chunks = ignoredCustomMarkupChunks[+index];
                return chunks[1] + uidAttr + index + uidAttr + chunks[2];
              }), type);
            };
          })(options.minifyJS);
        }
      }
      var token = uidAttr + ignoredCustomMarkupChunks.length + uidAttr;
      ignoredCustomMarkupChunks.push(/^(\s*)[\s\S]*?(\s*)$/.exec(match));
      return '\t' + token + '\t';
    });
  }

After checking further, the following line is the problem:

var reCustomIgnore = new RegExp('\\s*(?:' + customFragments.join('|') + ')+\\s*', 'g'); which results in the following by default: var reCustomIgnore = new RegExp('\\s*(?:<%[\\s\\S]*?%>|<\\?[\\s\\S]*?\\?>)+\\s*', 'g');

The person whou found the problem, probably used https://devina.io/redos-checker, which also uses the same type of attack string:

image

The problem usage of * and + in reCustomIgnore without any upper bounds ({0,50}, {1,50}), which leads to catastrophic backtracking.

Migitations

DanielRuf commented 2 weeks ago

For anyone asking, html-minifier-terser is also affected. So switching to it will not resolve this. You will just get less reported CVEs but you will also have a false-negative then.

sidewayss commented 2 weeks ago

And of course it affects all the other downstream packages, including these: