Open renvieir opened 5 years ago
We got a customer incident for the same issue (in the context of UI5).
I analysed it and think it is an example of catastrophic backtracking. The regex in the csslexer is composed of several sub-expressions, one of them for FUNCTION:
var FUNCTION = '(?!url[(])' + IDENT + '[(]';
For illustration purposes, I've expanded IDENT. Then FUNCTION looks like
var FUNCTION = '(?!url[(])' + '-?' + NMSTART + NMCHAR + '*' + '[(]';
The combination of the repetition `NMCHAR*` and the succeeding `'('` makes this a perfect candidate for backtracking: whenever the `(` is missing, the engine retries every other way of matching the identifier before giving up. On top of that, `NMSTART` and `NMCHAR` both allow alternative interpretations of digits that follow a backslash (either as part of a UNICODE escape sequence or as plain ASCII characters). Together, this sets the stage for catastrophic backtracking with exponential runtime.
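To make the exponential growth concrete, here is a minimal sketch that counts how many different ways `NMCHAR*` could consume a run of escapes. It assumes a simplified `NMCHAR` (one character from `[_a-z0-9-]`, or a backslash followed by 1 to 6 hex digits); the lexer's real definitions are richer, and `countParses` and `repeatEscape` are hypothetical helpers written only for this illustration.

```javascript
// Counts the ways a simplified NMCHAR* can consume the whole string.
// Assumption: NMCHAR is either one char of [_a-z0-9-] or '\' + 1..6 hex digits.
function countParses(s) {
  var ways = new Array(s.length + 1).fill(0);
  ways[s.length] = 1; // the empty remainder has exactly one parse
  for (var i = s.length - 1; i >= 0; i--) {
    // Option 1: consume a single plain name character.
    if (/[_a-z0-9-]/i.test(s[i])) {
      ways[i] += ways[i + 1];
    }
    // Option 2: consume a backslash escape with 1..6 hex digits.
    if (s[i] === '\\') {
      for (var len = 1; len <= 6; len++) {
        var digits = s.substr(i + 1, len);
        if (digits.length === len && /^[0-9a-f]+$/i.test(digits)) {
          ways[i] += ways[i + 1 + len];
        }
      }
    }
  }
  return ways[0];
}

// Builds n copies of the escape \123456.
function repeatEscape(n) {
  var out = '';
  for (var i = 0; i < n; i++) out += '\\123456';
  return out;
}

console.log(countParses(repeatEscape(1)));  // 6
console.log(countParses(repeatEscape(2)));  // 36
console.log(countParses(repeatEscape(10))); // 60466176
```

Each additional escape multiplies the number of parses by six under these assumptions, and a backtracking engine may have to walk through all of them when the trailing `(` never appears.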
Luckily, the article "Regex: Emulate Atomic Grouping (and Possessive Quantifiers) with LookAhead" describes a solution to this known issue with regular expressions.
Applying the proposed pattern to the FUNCTION sub-expression seems to fix the performance problem:
var FUNCTION = '(?!url[(])' + '(?=(' + IDENT + '))\\1' + '[(]';
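The following sketch compares the two variants side by side. The definitions of `UNICODE`, `NMSTART` and `NMCHAR` are simplified stand-ins (an assumption, not the lexer's exact sub-expressions), and both patterns are anchored with `^` so they can be tested in isolation rather than as part of the combined lexer regex.

```javascript
// Simplified stand-ins for the lexer's sub-expressions (assumptions).
var UNICODE = '\\\\[0-9a-f]{1,6}';
var NMSTART = '(?:[_a-z]|' + UNICODE + ')';
var NMCHAR = '(?:[_a-z0-9-]|' + UNICODE + ')';
var IDENT = '-?' + NMSTART + NMCHAR + '*';

// Original form: IDENT can be re-matched in many ways when '(' is missing.
var FUNCTION_NAIVE = '(?!url[(])' + IDENT + '[(]';
// Emulated atomic group: the lookahead matches IDENT once, captures it,
// and \1 consumes exactly that text. The engine does not backtrack into
// a completed lookahead, so no alternative parses of IDENT are tried.
var FUNCTION_ATOMIC = '(?!url[(])' + '(?=(' + IDENT + '))\\1' + '[(]';

var naive = new RegExp('^' + FUNCTION_NAIVE, 'i');
var atomic = new RegExp('^' + FUNCTION_ATOMIC, 'i');

// Builds 'a' followed by n escapes and no opening parenthesis.
function pathological(n) {
  var out = 'a';
  for (var i = 0; i < n; i++) out += '\\123456';
  return out;
}

// Both variants agree on ordinary input.
console.log(naive.test('rgb('), atomic.test('rgb('));   // true true
console.log(naive.test('url('), atomic.test('url('));   // false false

// Keep n tiny for the naive variant; its work grows exponentially with n.
console.log(naive.test(pathological(3)));               // false
// The atomic variant rejects a much longer input without blowing up.
console.log(atomic.test(pathological(200)));            // false
```

Note that `\\1` only refers to the identifier if its group is the first capturing group in the final pattern. When the sub-expression is embedded in a larger regex that contains earlier capturing groups, the back reference number has to be adjusted accordingly.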
Further testing is needed (e.g. whether the mandatory capturing group causes negative side-effects), and cross-browser support is a topic. The lookahead `?=` as well as the back reference `\1` might not be supported everywhere.
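One way to address the support question is a small runtime check before the new pattern is used. This is only a sketch; `supportsAtomicEmulation` is a hypothetical helper, not part of the lexer. As far as I know, both lookahead and back references have been part of the ECMAScript regular expression syntax since edition 3, and the existing `FUNCTION` expression already relies on the negative lookahead `(?!url[(])`, so the check mainly guards against non-conforming engines.

```javascript
// Returns true if the engine compiles the lookahead + back reference
// pattern and treats it atomically (no backtracking into the lookahead).
function supportsAtomicEmulation() {
  try {
    // Positive case: the captured run of a's is consumed by \1.
    var m = new RegExp('^(?=(a+))\\1b').exec('aaab');
    if (!m || m[0] !== 'aaab' || m[1] !== 'aaa') return false;
    // Atomic case: a plain 'a+a' would match 'aaa' by giving one 'a'
    // back; the emulated atomic group must not do that.
    return new RegExp('^(?=(a+))\\1a').test('aaa') === false;
  } catch (e) {
    // A SyntaxError here means the pattern is not supported.
    return false;
  }
}

console.log(supportsAtomicEmulation());
```

If the check fails, falling back to the original expression keeps the behaviour unchanged at the cost of the known performance problem.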
lexCss takes too much time to perform a regex match and thus blocks the UI.
The code snippet above is from line 239, applied to a font-family property. A single execution can take ~10 seconds.