tree-sitter-grammars / tree-sitter-markdown

Markdown grammar for tree-sitter
MIT License

Throws exception if asked to parse HTML via web-tree-sitter #93

Open savetheclocktower opened 1 year ago

savetheclocktower commented 1 year ago

Describe the bug

Reproducible from the playground:

cd tree-sitter-markdown
tree-sitter build-wasm . && tree-sitter playground

Code example

<h1>foo</h1>

Expected behavior

Any sane parsing of the tree would be useful here, but the specs suggest to me that the tree should render with an html_block node.

Actual behavior

Errors in both Chrome and Firefox. Here's the Chrome error:

Uncaught (in promise) TypeError: Cannot read properties of undefined (reading 'apply')
    at e.<computed> (tree-sitter.js:1:10370)
    at 0016e4de:0x27a2
    at 0016e4de:0x1824
    at tree-sitter.wasm:0x24b5b
    at Parser.parse (tree-sitter.js:1:38110)
    at handleCodeChange (playground.js:115:28)
    at it (codemirror.min.js:1:17760)
    at jn (codemirror.min.js:1:65062)
    at codemirror.min.js:1:61536
    at codemirror.min.js:1:61545

Appears to happen only when the HTML is the first content on the line. If I start with a blank document, I can type < without an error, but <a is enough to cause the error, which then repeats with each keystroke.

gushogg-blake commented 1 year ago

This might be to do with https://github.com/tree-sitter/tree-sitter/issues/949 - this grammar must be statically linked to tree-sitter as it uses C/C++ functions that aren't in the main exports.json. More info here - https://edita.vercel.app/blog/tree-sitter-howto/.

savetheclocktower commented 1 year ago

Yeah, I should try this again in Pulsar now that we've added some code to detect these missing exports.
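
For illustration, a check for missing exports could look roughly like the sketch below. This is not Pulsar's actual code: findMissingImports and the sample symbol names are hypothetical. It assumes the grammar's function imports are available in the shape returned by WebAssembly.Module.imports(), and that the names the host runtime provides are known as a plain list.

```javascript
// Hypothetical helper (not taken from Pulsar or tree-sitter): list the
// function imports a grammar module needs that the host does not provide.
//
// `imports` has the shape returned by WebAssembly.Module.imports(module):
//   [{ module: "env", name: "iswalpha", kind: "function" }, ...]
// `provided` is any iterable of symbol names the host runtime exports.
function findMissingImports(imports, provided) {
  const available = new Set(provided);
  return imports
    .filter((entry) => entry.kind === "function") // skip memory, table, globals
    .map((entry) => entry.name)
    .filter((name) => !available.has(name))
    .sort();
}

// Illustrative data only; these symbol lists are made up for the example.
const grammarImports = [
  { module: "env", name: "malloc", kind: "function" },
  { module: "env", name: "iswalpha", kind: "function" },
  { module: "env", name: "memory", kind: "memory" },
  { module: "env", name: "towlower", kind: "function" },
];
const hostExports = ["malloc", "free", "towlower"];

const missing = findMissingImports(grammarImports, hostExports);
console.log(missing);
```

Running a check like this before calling Parser.parse would let an editor report the missing symbol names up front instead of failing with the TypeError shown in the stack trace.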

savetheclocktower commented 1 year ago

@gushogg-blake, I read your blog post. Regarding this section…

There are two other solutions suggested in the emscripten docs:

  • add EMCC_FORCE_STDLIBS=1 and -s EXPORT_ALL=1 to the emcc command in tree-sitter
  • add missing symbols to exports.json (as described in #949)

but neither of these worked.

…I can at least confirm that Pulsar was able to get the second solution to work; the README here has more information. Maybe you'll find it useful for your editor.
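
As a rough sketch of that second approach, missing symbols could be merged into an export list like this. mergeExports is a hypothetical helper, and the sketch assumes the list is a JSON array of underscore-prefixed names (the convention Emscripten uses for EXPORTED_FUNCTIONS); check the actual exports.json before relying on that. The runtime would presumably still need to be rebuilt afterwards for the new exports to take effect.

```javascript
// Hypothetical helper: merge raw C symbol names into an export list.
// Assumption: the list is an array of underscore-prefixed names, e.g. "_malloc".
function mergeExports(existing, missingSymbols) {
  const merged = new Set(existing); // a Set drops duplicates
  for (const symbol of missingSymbols) {
    merged.add("_" + symbol); // raw C name -> Emscripten export name
  }
  return [...merged].sort();
}

// Illustrative data only; not the real contents of exports.json.
const current = ["_calloc", "_free", "_malloc"];
const updated = mergeExports(current, ["iswalpha", "malloc"]);

// Print in a form that could be written back to a JSON file.
console.log(JSON.stringify(updated, null, 2));
```

The helper returns a new sorted array and leaves its input untouched, so the result can be diffed against the original list before anything is written to disk.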

gushogg-blake commented 1 year ago

@savetheclocktower Nice, thanks - I didn't have all the linkages clear in my head at the time, so I could have been missing something simple. Might give it a try with the next grammar.