tree-sitter / py-tree-sitter

Python bindings to the Tree-sitter parsing library
https://tree-sitter.github.io/py-tree-sitter/
MIT License

[Draft] WASM support #272

Open CGamesPlay opened 1 month ago

CGamesPlay commented 1 month ago

Hello, I've been experimenting with what it would take to add wasm support to py-tree-sitter and wanted to get some feedback. Is there interest in merging this feature with py-tree-sitter?

I've made the top-level API similar to what a user of wasmtime would expect: you pass the wasm module bytes and the engine into a separate constructor for Language (Language.from_wasm). After that, usage is identical to normal Language objects. I've modified Parser to automatically manage the wasm store object, since it doesn't have any configurable parameters.

My goal with this change is to keep wasmtime as an optional dependency. Wasmtime's binary wheels are about 5 MiB, whereas tree-sitter's are only 0.5 MiB, so for that reason alone it seems like this niche feature should be optional rather than mandatory. Optional dependencies can't be expressed at compile time, so to support this I have to create trampoline functions that get populated at runtime by pulling them out of wasmtime's ffi module (wasmtime.c handles this).
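For readers unfamiliar with the pattern, here is a rough pure-Python analogue of runtime-populated function slots for an optional dependency. It is only an illustration: the PR does this in C (wasmtime.c), and the class name `OptionalBackend`, its methods, and the use of the standard-library `math` module as a stand-in for wasmtime are all hypothetical.

```python
import importlib
from typing import Any, Callable, Dict, Tuple


class OptionalBackend:
    """Late-bound function slots for a dependency that may not be installed.

    Hypothetical helper for illustration; not part of py-tree-sitter or wasmtime.
    """

    def __init__(self, module_name: str, symbols: Tuple[str, ...]) -> None:
        self._module_name = module_name
        self._symbols = symbols
        self._slots: Dict[str, Callable[..., Any]] = {}
        self._loaded = False

    def _load(self) -> None:
        # Import only when the feature is first used, so the dependency
        # is never required at install or import time.
        try:
            module = importlib.import_module(self._module_name)
        except ImportError as exc:
            raise RuntimeError(
                f"{self._module_name} is required for this feature "
                "but is not installed"
            ) from exc
        # Fill every slot by looking the symbol up in the loaded module.
        for name in self._symbols:
            self._slots[name] = getattr(module, name)
        self._loaded = True

    def call(self, name: str, *args: Any, **kwargs: Any) -> Any:
        # The "trampoline": forward the call through the populated slot.
        if not self._loaded:
            self._load()
        return self._slots[name](*args, **kwargs)


# Stand-in demo: `math` plays the role of the optional dependency.
backend = OptionalBackend("math", ("sqrt", "floor"))
print(backend.call("sqrt", 16.0))

# A dependency that is absent only fails when the feature is actually used.
missing = OptionalBackend("module_that_is_not_installed", ("anything",))
try:
    missing.call("anything")
except RuntimeError as err:
    print(err)
```

The point of the design is that users who never touch the wasm path never pay for the extra dependency, while those who do get a clear error if it is missing.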

There isn't a good way to land this change until wasmtime v24 is released, because earlier versions lack the method I added for cloning a reference to an Engine. Without that ability, there can only be a single language per engine, and only a single parser per language.

So, if there is interest in getting this into the main distribution, I can help close out the remaining issues with this PR. If there isn't, I can fork the project and make a separate variant which supports the functionality. Thanks for your consideration!

In order to land this PR:

ObserverOfTime commented 1 month ago

I'll consider this one after completing the current TODOs.

To anyone interested in this PR, please react to it with :+1:.