ndmitchell / tagsoup

Haskell library for parsing and extracting information from (possibly malformed) HTML/XML documents

Incorrect source position before bogus comment #71

Closed by link2xt 6 years ago

link2xt commented 6 years ago

Bug #70 is fixed, but the same bug with bogus comments is not:

> parseTagsOptions Text.HTML.TagSoup.parseOptions{ optTagWarning = True, optTagPosition = True } "<div><!--foo-->bar</div>"
[TagPosition 1 1,TagOpen "div" [],TagPosition 1 6,TagComment "foo",TagPosition 1 16,TagText "bar",TagPosition 1 19,TagClose "div"]
> parseTagsOptions Text.HTML.TagSoup.parseOptions{ optTagWarning = True, optTagPosition = True } "<div><?foo</div>"
[TagPosition 1 1,TagOpen "div" [],TagPosition 1 8,TagOpen "?foo<" [("div","")],TagPosition 1 13,TagWarning "Unexpected \"/\"",TagPosition 1 17,TagWarning "Expected \"?>\""]

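Note the TagPosition 1 8 in the second example: the bogus comment starts at column 6, so the position should be 1 6, as it is for the regular comment in the first example.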

ndmitchell commented 6 years ago

The second one now gives:

[TagPosition 1 1,TagOpen "div" [],TagPosition 1 6,TagOpen "?foo<" [("div","")],TagPosition 1 13,TagWarning "Unexpected \"/\"",TagPosition 1 17,TagWarning "Expected \"?>\""]

I believe that is what you were hoping for: the position before the bogus comment is now 1 6 rather than 1 8.

ndmitchell commented 6 years ago

I've released 0.14.4 with the fix included. Let me know if you find any more issues.
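
For anyone who wants to check their installed tagsoup against this report, here is a minimal sketch that prints only the source positions for the two inputs from the issue. The helper name `positionsOf` is hypothetical and not part of tagsoup; the options are the same ones used in the GHCi session above. According to this thread, the second position for the bogus-comment input should be (1,6) from 0.14.4 onward, and (1,8) on the affected version.

```haskell
module Main where

import Text.HTML.TagSoup

-- | Hypothetical helper (not part of tagsoup): parse the input with
-- position and warning tags enabled, then keep only the (row, column)
-- pairs carried by the TagPosition entries.
positionsOf :: String -> [(Int, Int)]
positionsOf html =
  [ (row, col)
  | TagPosition row col <- parseTagsOptions opts html
  ]
  where
    -- Same options as in the issue's GHCi session.
    opts = parseOptions { optTagWarning = True, optTagPosition = True }

main :: IO ()
main = do
  -- Regular comment: the first example from the issue.
  print (positionsOf "<div><!--foo-->bar</div>")
  -- Bogus comment: the input whose second position was reported as wrong.
  print (positionsOf "<div><?foo</div>")
```

Comparing the second element of each printed list is enough to tell whether the fix is present, since both inputs place the comment immediately after the five-character `<div>` tag.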