tree-sitter / tree-sitter-html

HTML grammar for Tree-sitter
MIT License

Using pointers (.nextSibling etc) to walk tree misses closing </script> tag #33

Closed gushogg-blake closed 1 year ago

gushogg-blake commented 3 years ago

Hi! I'm trying to iterate efficiently over the nodes in a tree, starting at a given node. I was using cursors for this, but I want to be able to start at a child node and go "up" the tree, which cursors don't support, so I started using the node pointers (.nextSibling etc.) instead. With this approach I ran into the following issue:

With this HTML input:

<!doctype html>
<html>
    <body>
        <script>

        </script>
    </body>
</html>

The pointers approach skips the end_tag node for the closing </script> tag.

Here is a comparison of the list of node types generated by the two different approaches:

cursor:   fragment doctype <! doctype > text element start_tag < tag_name > text element start_tag < tag_name > text script_element start_tag < tag_name > raw_text end_tag </ tag_name > text end_tag </ tag_name > text end_tag </ tag_name > text
pointers: fragment doctype <! doctype > text element start_tag < tag_name > text element start_tag < tag_name > text script_element start_tag < tag_name > raw_text tag_name > text end_tag </ tag_name > text end_tag </ tag_name > text

As you can see, the pointers version is missing end_tag </ immediately after raw_text; otherwise the two lists are identical (scroll to around the middle to see where they diverge).
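
The point of divergence can also be found mechanically rather than by scrolling. The following is a minimal sketch, not part of the linked test case: `firstDivergence` is a hypothetical helper, and the two strings are copied from the lists above.

```javascript
// Hypothetical helper: index of the first position where two lists differ,
// or -1 if they are identical.
function firstDivergence(a, b) {
  const limit = Math.min(a.length, b.length);
  for (let i = 0; i < limit; i++) {
    if (a[i] !== b[i]) return i;
  }
  // One list may be a strict prefix of the other.
  return a.length === b.length ? -1 : limit;
}

// Node type lists copied verbatim from the comparison above.
const cursorTypes = (
  "fragment doctype <! doctype > text element start_tag < tag_name > text " +
  "element start_tag < tag_name > text script_element start_tag < tag_name > " +
  "raw_text end_tag </ tag_name > text end_tag </ tag_name > text " +
  "end_tag </ tag_name > text"
).split(" ");

const pointerTypes = (
  "fragment doctype <! doctype > text element start_tag < tag_name > text " +
  "element start_tag < tag_name > text script_element start_tag < tag_name > " +
  "raw_text tag_name > text end_tag </ tag_name > text " +
  "end_tag </ tag_name > text"
).split(" ");

const divergeAt = firstDivergence(cursorTypes, pointerTypes);
console.log(divergeAt, cursorTypes[divergeAt], pointerTypes[divergeAt]);
```

The two lists agree up to and including raw_text; at the first differing position the cursor list has end_tag while the pointers list already has tag_name.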

Here is a test case to produce the above output with NodeJS bindings: https://gist.github.com/user896724/0da06259ee2a2802cf619ca3fe03f9fe
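
For readers without the gist to hand, a pointer-based walk of the kind described above might look roughly like the sketch below. It is an illustration under assumptions, not the code from the gist: `nextInDocumentOrder`, `walkFrom`, and `makeNode` are hypothetical names, and `makeNode` builds a small stand-in tree so the example runs without the tree-sitter bindings. It relies only on the `type`, `firstChild`, `nextSibling`, and `parent` properties, which nodes in the NodeJS bindings also expose.

```javascript
// Hypothetical helper: the node that follows `node` in document (pre-order)
// order, found using only the firstChild / nextSibling / parent pointers.
function nextInDocumentOrder(node) {
  if (node.firstChild) return node.firstChild;
  // No children: climb until some ancestor (or the node itself) has a sibling.
  for (let current = node; current; current = current.parent) {
    if (current.nextSibling) return current.nextSibling;
  }
  return null; // reached the end of the tree
}

// Yields `start` and every node after it in document order. Unlike a cursor,
// this can continue "up" past the subtree of the starting node.
function* walkFrom(start) {
  for (let node = start; node; node = nextInDocumentOrder(node)) {
    yield node;
  }
}

// Stand-in for a parsed tree (an assumption, used only to make this runnable).
function makeNode(type, children = []) {
  const node = { type, parent: null, firstChild: children[0] || null, nextSibling: null };
  children.forEach((child, i) => {
    child.parent = node;
    child.nextSibling = children[i + 1] || null;
  });
  return node;
}

const rawText = makeNode("raw_text");
const script = makeNode("script_element", [
  makeNode("start_tag", [makeNode("<"), makeNode("tag_name"), makeNode(">")]),
  rawText,
  makeNode("end_tag", [makeNode("</"), makeNode("tag_name"), makeNode(">")]),
]);
const root = makeNode("fragment", [script, makeNode("text")]);

const typesFromRoot = [...walkFrom(root)].map((n) => n.type);
const typesFromRawText = [...walkFrom(rawText)].map((n) => n.type);
console.log(typesFromRoot.join(" "));
console.log(typesFromRawText.join(" "));
```

On the stand-in tree the walk visits end_tag directly after raw_text, which is the behaviour the cursor output shows. A walk over a real tree that skips it would suggest that one of the sibling or parent pointers around raw_text is not returning the expected node.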

amaanq commented 1 year ago

I'd like to assume it's fixed now, given it's been over 2 years, but if it hasn't been, feel free to say so and I'll reopen.