Closed by 43081j 2 years ago
Still working on this FYI.
it's a really awkward problem with many edge cases, so i'm just trying to come up with a simple solution and avoid an overcomplicated, unreliable one.
it turns out normalisation in the parser sometimes produces less code than the input, and sometimes more.
e.g.
<a><b attr="></b></a>
<!-- becomes -->
<a></a>
so it actually became shorter than the input source.
meanwhile:
<a attr></a>
<!-- becomes -->
<a attr=""></a>
so it became longer than the input source. this one is actually a current bug i only noticed now, since if we ever report the position of the quotes it'll be wrong.
turns out i'm an idiot and went down a rabbit hole for no reason.
the positions are not off. parse5 already maps the locations in the normalised document back to the locations in the original source.
will update the tests to assert on this, but it's already fine as is.
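To make the point about positions concrete, here is a minimal sketch of why offsets that refer to the original source stay correct even when the normalised output is longer. It does not call parse5; the `SourceRange` shape only mirrors the `startOffset`/`endOffset` fields parse5 attaches when location info is enabled, and the `sliceSource` helper and the hand-written range are hypothetical, for illustration only.

```typescript
// Mirrors the offset fields of a parse5 location (assumption: only these two are needed here).
interface SourceRange {
  startOffset: number; // 0-based, inclusive
  endOffset: number;   // 0-based, exclusive
}

// Hypothetical helper: return the exact text a range points at.
function sliceSource(source: string, range: SourceRange): string {
  return source.slice(range.startOffset, range.endOffset);
}

const original = '<a attr></a>';
const normalised = '<a attr=""></a>'; // what serialising the parsed tree gives

// Hand-written range for the closing </a> tag, measured against the ORIGINAL input.
const endTagRange: SourceRange = { startOffset: 8, endOffset: 12 };

// Slicing the original input gives the tag we expect.
const fromOriginal = sliceSource(original, endTagRange);

// Slicing the normalised output with the same range is off by the inserted `=""`.
const fromNormalised = sliceSource(normalised, endTagRange);

console.log(fromOriginal);   // </a>
console.log(fromNormalised); // ""><
```

So as long as reported ranges are applied to the input the user actually wrote, not to a re-serialised tree, the extra or missing characters from normalisation do not matter.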
As a follow-up to #117, we need to double-check that location resolution works correctly for documents.
It may be off because parsing a document also normalises it by adding missing tags (head, body, etc.).
A couple of new tests should be enough to check this.
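One shape such a test could take is sketched below. It uses a hand-built tree rather than real parser output, and it assumes that nodes added by normalisation carry no source location while nodes written by the author do. That assumption should be verified against parse5 itself. `SketchNode` and `findSynthesised` are hypothetical names, not part of any library.

```typescript
interface SourceRange {
  startOffset: number;
  endOffset: number;
}

// Simplified stand-in for a parsed node (assumption: not parse5's real node type).
interface SketchNode {
  nodeName: string;
  sourceCodeLocation?: SourceRange | null;
  childNodes: SketchNode[];
}

// Collect the names of nodes that have no location, i.e. nodes the author did not write.
function findSynthesised(node: SketchNode): string[] {
  const names: string[] = node.sourceCodeLocation ? [] : [node.nodeName];
  for (const child of node.childNodes) {
    names.push(...findSynthesised(child));
  }
  return names;
}

// Hand-built tree for the input `<p>hi</p>` parsed as a full document:
// html, head and body are added by the parser; only <p> comes from the source.
const source = '<p>hi</p>';
const tree: SketchNode = {
  nodeName: 'html',
  sourceCodeLocation: null,
  childNodes: [
    { nodeName: 'head', sourceCodeLocation: null, childNodes: [] },
    {
      nodeName: 'body',
      sourceCodeLocation: null,
      childNodes: [
        {
          nodeName: 'p',
          sourceCodeLocation: { startOffset: 0, endOffset: source.length },
          childNodes: [],
        },
      ],
    },
  ],
};

const synthesised = findSynthesised(tree);
console.log(synthesised.join(',')); // html,head,body
```

A real test would replace the hand-built tree with the parsed document and additionally assert that the range on `<p>` slices back to the full original input.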