grantjenks / py-tree-sitter-languages

Binary Python wheels for all tree sitter languages.

Rewrite for py-tree-sitter 0.22 #65

Open ObserverOfTime opened 4 months ago

ObserverOfTime commented 4 months ago

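This PR rewrites the project to make it work with py-tree-sitter 0.22.

* The user-facing API is still the same.
* Each language is compiled into a separate module.
* Adding new languages is much easier than before (see CONTRIBUTING.md).
* Languages that can currently be installed from PyPI (or built from git) are removed.
* Wheels for niche architectures are no longer built (just like py-tree-sitter).

Fixes #55, closes #60, fixes #61, fixes #63, fixes #64, fixes #67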


Don't take this to mean I will be maintaining this project going forward.

If @grantjenks is still unable to maintain it and doesn't want to give it to some random person, then someone can fork it, pull my changes, rename the package, and maintain the fork.

But don't try to package all ~400 languages. Just some popular ones that don't yet have a package.

mbhavya commented 3 months ago

> This PR rewrites the project to make it work with py-tree-sitter 0.22.
>
> * The user-facing API is still the same.
> * Each language is compiled into a separate module.
> * Adding new languages is much easier than before (see CONTRIBUTING.md).
> * Languages that can currently be installed from PyPI (or built from git) are removed.
> * Wheels for niche architectures are no longer built (just like py-tree-sitter).
>
> Fixes #55, closes #60, fixes #61, fixes #63, fixes #64, fixes #67
>
> Don't take this to mean I will be maintaining this project going forward.
>
> If @grantjenks is still unable to maintain it and doesn't want to give it to some random person, then someone can fork it, pull my changes, rename the package, and maintain the fork.
>
> But don't try to package all ~400 languages. Just some popular ones that don't yet have a package.

Hi @ObserverOfTime great work.

Can you please elaborate on how you decide which languages make the cut and which don't? As I understand it, different developers will have different requirements.

Also, I understand that maintaining so many languages would be impractical, but shouldn't there then be a way for a developer to install additional language support at runtime (after installing the package via pip), without having to go through a PR and the build process to get that language support? Maybe this could be achieved through documentation that lays out the detailed steps.

ObserverOfTime commented 3 months ago

You can add any grammar you want here (see CONTRIBUTING.md for the process), but I chose not to add those that have been updated upstream and can be installed from PyPI or git.
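For grammars that are already published upstream, one quick way to see which of them are available in the current environment is to probe for their modules. This is only a minimal sketch: it assumes the common convention that a PyPI distribution named `tree-sitter-<name>` installs an importable module `tree_sitter_<name>`, and the helper names and the example language list are hypothetical, not part of this project.

```python
from importlib.util import find_spec


def module_name_for(language: str) -> str:
    """Map a language name to its assumed module name (tree_sitter_<name>)."""
    # Assumption: upstream grammars follow the tree_sitter_<name> convention.
    return "tree_sitter_" + language.lower().replace("-", "_")


def installed_grammars(languages):
    """Return {language: bool} telling whether each assumed module is importable."""
    return {
        lang: find_spec(module_name_for(lang)) is not None
        for lang in languages
    }


if __name__ == "__main__":
    # Hypothetical list of languages a project might need.
    for lang, present in installed_grammars(["python", "rust", "c-sharp"]).items():
        print(f"{lang}: {'installed' if present else 'not installed'}")
```

A grammar reported as not installed may still exist on PyPI or in git under a different name, so treat the result as a hint rather than a definitive answer.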

You can also submit a PR to grammars that have not been updated:

  1. Run tree-sitter generate to generate the new bindings.
  2. Fix inconsistencies in the generated bindings (e.g. versions & scanners).
  3. Optionally add CI & packaging workflows (will require the maintainer to set up their tokens).

logan-markewich commented 3 months ago

Is there a plan to merge and release this PR? As it stands, the package isn't usable with the latest versions until this is fixed.

Goldziher commented 3 months ago

@ObserverOfTime are you willing to release your own package instead of this one?

ObserverOfTime commented 3 months ago

No.

Goldziher commented 3 months ago

Ok, thank you @ObserverOfTime - I will fork your PR and publish the package, after some adjustments.

Goldziher commented 3 months ago

I really like what you did in the PR, so I am using most of it (I will give due credit in the README for sure!).

I am going in a different direction though: I am creating a giant language pack for tree-sitter. Python first, but it can also have other bindings. It's the use case I have: I need to parse all sorts of code types and chunk them for AI processing, currently in Python.