isaacs / sax-js

A sax style parser for JS

BUG: Non-strict entity detection misses close tag #249

Open solter opened 2 years ago

solter commented 2 years ago

Problem

When an XML document contains data such as

<root>
  <a>content&x</a>
  <b/>
</root>

and strict mode is off, the parser ends up emitting an ontext event with the value content&x</a>\n, so the closing </a> tag is swallowed into the text instead of being recognized as a close tag.
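A minimal reproduction sketch for the behaviour described above. The helper name collectEvents is hypothetical and not part of sax-js; it only uses sax.parser(false), the ontext and onclosetag handlers, and write/close. The module is passed in as an argument so the helper does not load anything itself.

```javascript
// Hypothetical helper (not part of sax-js): records the text and close-tag
// events emitted for `xml`. `sax` is the module object, e.g. require('sax').
function collectEvents(sax, xml) {
  const events = []
  const parser = sax.parser(false) // false selects non-strict mode
  parser.ontext = (text) => events.push(['text', text])
  parser.onclosetag = (name) => events.push(['closetag', name])
  parser.write(xml).close()
  return events
}

// The document from the report, one array entry per line.
const xml = ['<root>', '  <a>content&x</a>', '  <b/>', '</root>'].join('\n')

// Usage, assuming the sax package is installed:
//   console.log(collectEvents(require('sax'), xml))
```

If the report is accurate, the logged events should include a text event that contains </a> and no close-tag event for the a element.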

Root cause

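This is caused by the assumption that an illegal entity character, once parsed, should remain part of the text for the tag; this happens at lib/sax.js line 1492. In the example above the illegal character is the < that begins </a>, so it is appended to the text buffer and never reaches the tag-open handling.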

Solution outline

Fixing this would probably require changing the code around line 1492 to something like the following:

i-- // reprocess the illegal character
parser[buffer] += '&' + parser.entity
parser.entity = ''
parser.state = returnState
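To illustrate why stepping the index back helps, here is a self-contained toy scanner. It is not sax-js code: the scan function, its three states, and the reprocess flag are all made up for this sketch, and entity lookup is omitted. It only contrasts "append the illegal character to the text" with "hand the illegal character back to the text state".

```javascript
// Toy scanner for illustration only; NOT the sax-js implementation.
// reprocess = false mimics the behaviour described in this issue,
// reprocess = true mimics the proposed `i--` change.
function scan(input, reprocess) {
  const events = []
  let state = 'TEXT'
  let text = ''
  let entity = ''
  let tag = ''

  const flushText = () => {
    if (text) events.push(['text', text])
    text = ''
  }

  for (let i = 0; i < input.length; i++) {
    const c = input[i]
    if (state === 'TEXT') {
      if (c === '<') {
        flushText()
        tag = ''
        state = 'TAG'
      } else if (c === '&') {
        entity = ''
        state = 'TEXT_ENTITY'
      } else {
        text += c
      }
    } else if (state === 'TEXT_ENTITY') {
      if (c === ';') {
        text += '&' + entity + ';' // entity lookup omitted in this toy
        state = 'TEXT'
      } else if (/[A-Za-z0-9#]/.test(c)) {
        entity += c
      } else {
        // Illegal character inside an entity name.
        text += '&' + entity
        if (reprocess) {
          i-- // let the TEXT state see this character again
        } else {
          text += c // keep it as text, so a '<' never opens a tag
        }
        entity = ''
        state = 'TEXT'
      }
    } else {
      // TAG state: collect everything up to '>'
      if (c === '>') {
        events.push(['tag', tag])
        state = 'TEXT'
      } else {
        tag += c
      }
    }
  }

  if (state === 'TEXT_ENTITY') text += '&' + entity // unterminated entity at end of input
  flushText()
  return events
}

const sample = '<a>content&x</a><b/>'
console.log(JSON.stringify(scan(sample, false)))
// [["tag","a"],["text","content&x</a>"],["tag","b/"]]
console.log(JSON.stringify(scan(sample, true)))
// [["tag","a"],["text","content&x"],["tag","/a"],["tag","b/"]]
```

With reprocess off, the </a> ends up inside the text event, which matches the symptom reported here. With it on, the < is seen by the text state and the close tag is recognized.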