logXmlEntity in log syntax has high cpu utilization on long hex values #510

dmlary commented 4 years ago

Does this bug happen when you install plugin without vim-polyglot? No, it comes from the vim-polyglot syntax files.

Describe the bug: I work with log files that contain JSON-encoded data, some of it rather long. When syntax highlighting is enabled, the following line (buried in a long log file) causes Vim to sit at 100% CPU utilization for more than a minute on Vim 7, and for two seconds on Vim 8.

# the following line has a 1220-byte string for data
I, [2020-07-08T14:41:34.438343 #15]  INFO -- service: { rc: 1220, errno: 0, data: '00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000' }

syntime report under vim8 with just that line:

  TOTAL      COUNT  MATCH   SLOWEST     AVERAGE   NAME               PATTERN
  2.750524   9      0       2.750508    0.305614  logXmlEntity       \&\w\+;
  0.000460   9      0       0.000447    0.000051  logDate            \(\(Mon\|Tue\|Wed\|Thu\|Fri\|Sat\|Sun\) \)\?\(Jan\|Feb\|Mar\|Ap
  0.000221   1      0       0.000221    0.000221  logSysColumns      \w\(\w\|\.\|-\)\+ \(\w\|\.\|-\)\+\(\[\d\+\]\)\?:
  0.000160   9      0       0.000157    0.000018  logUUID            \w\{8}-\w\{4}-\w\{4}-\w\{4}-\w\{12}
  0.000134   22     13      0.000110    0.000006  logFloatNumber     \<\d.\d\+[eE]\?\>
  0.000133   27     19      0.000117    0.000005  logOperator        [;,\?\:\.\<=\>\~\/\@\&\!$\%\&\+\-\|\^(){}\*#]
  0.000131   12     3       0.000117    0.000011  logDate            \d\{2,4}[-\/]\(\d\{2}\|Jan\|Feb\|Mar\|Apr\|May\|Jun\|Jul\|Aug\|
  0.000099   15     6       0.000093    0.000007  logHexNumber       \<\d\x\+\>
  0.000097   17     8       0.000086    0.000006  logNumber          \<-\?\d\+\>
  0.000080   1      0       0.000080    0.000080  logTimeZone        \d\{4} [A-Z]\{2,5}\>
  0.000075   9      0       0.000060    0.000008  logDomain          \v(^|\s)(\w|-)+(\.(\w|-)+)+\s
  0.000071   13     4       0.000052    0.000005  logTime            \d\{2}:\d\{2}:\d\{2}\(\.\d\{2,6}\)\?\(\s\?[-+]\d\{2,4}\|Z\)\?\>
  0.000058   9      0       0.000050    0.000006  logMD5             \<[a-z0-9]\{32}\>
  0.000057   9      0       0.000054    0.000006  logIPV6            \<\x\{1,4}\(:\x\{1,4}\)\{7}\>
  0.000054   9      0       0.000049    0.000006  logFilePath        [^a-zA-Z0-9"']\@<=\/\w[^\n|,; ()'"\]{}]\+
  0.000051   9      0       0.000049    0.000006  logHexNumber       \<0[xX]\x\+\>
  0.000051   9      0       0.000050    0.000006  logBinaryNumber    \<0[bB][01]\+\>
  0.000047   9      0       0.000045    0.000005  logIPV4            \<\d\{1,3}\(\.\d\{1,3}\)\{3}\>
  0.000047   9      0       0.000043    0.000005  logMacAddress      \<\x\{2}\(:\x\{2}\)\{5}
  0.000046   9      0       0.000043    0.000005  logFilePath        \<\w:\\[^\n|,; ()'"\]{}]\+
  0.000035   13     4       0.000030    0.000003  logBrackets        [\[\]]
  0.000024   1      1       0.000024    0.000024  logString          $
  0.000015   26     17      0.000002    0.000001  logString          '\(s \|t \| \w\)\@!
  0.000008   9      0       0.000002    0.000001  IndentLineSpace    ^\s\+
  0.000006   9      0       0.000004    0.000001  logEmptyLines      -\{3,}
  0.000005   9      0       0.000004    0.000001  logEmptyLines      - -
  0.000005   9      0       0.000004    0.000001  logString          "
  0.000005   9      0       0.000004    0.000001  logXmlDoctype      <!DOCTYPE[^>]*>
  0.000005   9      0       0.000003    0.000001  logXmlComment      <!--
  0.000005   9      0       0.000005    0.000001  logXmlCData        <!\[CDATA\[.*\]\]>
  0.000004   9      0       0.000004    0.000000  logEmptyLines      \*\{3,}
  0.000004   1      1       0.000004    0.000004  logString          '
  0.000004   9      0       0.000003    0.000000  logXmlHeader       <?\(\w\|-\)\+\(\s\+\w\+\(="[^"]*"\|='[^']*'\)\?\)*?>
  0.000004   9      0       0.000003    0.000000  logXmlTag          <\/\?\(\(\w\|-\)\+:\)\?\(\w\|-\)\+\(\(\n\|\s\)\+\(\(\w\|-\)\+:\
  0.000003   9      0       0.000003    0.000000  logEmptyLines      =\{3,}
  0.000003   1      0       0.000003    0.000003  logString          \\.
  0.000003   1      0       0.000003    0.000003  logString          s
  0.000003   1      1       0.000003    0.000003  logTimeZone        [A-Z]\{2,5}\>\( \d\{4}\)\?
  0.000003   9      0       0.000003    0.000000  logUrl             http[s]\?:\/\/[^\n|,; '"]\+
  0.000002   9      0       0.000001    0.000000  logDate            ^20\d\{6}

  2.752742   377
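
For anyone who wants to check this against their own files, here is a minimal sketch of collecting a report like the one above, followed by a possible local workaround. It assumes Vim was built with the `+profile` feature and that the buffer's filetype is `log`. The override is an assumption, not a confirmed fix: according to `:help /\&`, `\&` in a Vim pattern is the branch operator rather than a literal ampersand, so `\&\w\+;` may end up trying `\w\+;` at every column of the long hex run, whereas a plain `&` matches the literal character. The file location `~/.vim/after/syntax/log.vim` is only a suggestion.

```vim
" Profiling sketch: run these with the slow log file open.
syntime clear
syntime on
redraw!
syntime off
syntime report

" Hypothetical workaround, meant for ~/.vim/after/syntax/log.vim:
" drop the shipped item and match a literal ampersand instead.
syntax clear logXmlEntity
syntax match logXmlEntity /&\w\+;/
```

If the assumption holds, rerunning the profiling commands after the override should show `logXmlEntity` dropping out of the top of the report. `syntax clear logXmlEntity` removes only the syntax item, so any existing highlight link for the group is left in place.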

To Reproduce:

sheerun commented 4 years ago

