google / gumbo-parser

An HTML5 parsing library in pure C99
Apache License 2.0

Wrong parsing of web component (i.e. Polymer) tags #402

Closed: essenciary closed this issue 6 years ago

essenciary commented 6 years ago

I'm using gumbo-parser from Julia, and when I try to parse web component tags, the attributes are duplicated and also show up in the closing tag:

julia> Gumbo.parsehtml("""<px-spinner size="100"></px-spinner>""")
HTML Document:
<!DOCTYPE >
<HTML>
  <head></head>
  <body>
    <px-spinner size="100" size="100"></px-spinner size="100">
  </body>
</HTML>

essenciary commented 6 years ago

I understand the issue might be caused by the Julia wrapper. Its maintainer can't check, so I'll take a look and report back with my findings.

craigbarnes commented 6 years ago

It's almost certainly the Julia wrapper. Using the lua-gumbo tree builder and serializer, I get:

<html>
    <head></head>
    <body>
        <px-spinner size="100"></px-spinner>
    </body>
</html>
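
A hedged illustration of how output like the one reported above can arise (the thread does not confirm the actual cause): the duplicated attribute and the `</px-spinner size="100">` closing tag are what one would expect if a serializer used the raw start-tag text as the element name. For tags it does not recognize, gumbo reports `GUMBO_TAG_UNKNOWN` and exposes the raw source text in the element's `original_tag` field, so a wrapper has to extract the name itself; gumbo offers `gumbo_tag_from_original_text` for this. The sketch below shows the same idea without depending on gumbo. `tag_name_from_raw` is a hypothetical helper written for this example, not code from gumbo or the Julia wrapper.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of gumbo or the Julia wrapper): copies only
 * the tag name out of raw tag text such as `<px-spinner size="100">` or
 * `</px-spinner>` into `out`, NUL-terminates it, and returns its length.
 * The name is truncated if `out` is too small. */
size_t tag_name_from_raw(const char *raw, char *out, size_t out_size) {
    const char *p = raw;
    const char *end = raw + strlen(raw);
    size_t n = 0;

    if (out_size == 0) {
        return 0;
    }

    /* Skip the leading '<' and, for end tags, the '/'. */
    if (p < end && *p == '<') p++;
    if (p < end && *p == '/') p++;

    /* The name ends at the first whitespace, '/' or '>' character;
     * everything after that is attributes or the tag's closing syntax. */
    while (p < end && n + 1 < out_size) {
        char c = *p;
        if (c == ' ' || c == '\t' || c == '\n' || c == '\f' || c == '\r' ||
            c == '/' || c == '>') {
            break;
        }
        out[n++] = c;
        p++;
    }
    out[n] = '\0';
    return n;
}
```

Called on `<px-spinner size="100">`, the helper yields only `px-spinner`, so a serializer built on it would emit the attributes once, from the attribute list, and close the element with a plain `</px-spinner>`.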

essenciary commented 6 years ago

Thanks, it was indeed the wrapper :) This should be closed, though: I made a PR with the fixes months ago, and it has already been merged.