sublime-treesitter / TreeSitter

Sublime Text Tree-sitter configuration and abstraction layer
MIT License

Install tree_sitter package via Package Control #2

Closed: deathaxe closed this issue 7 months ago

deathaxe commented 7 months ago
  1. The tree_sitter package is now available to install and upgrade via Package Control. The current pip install tree_sitter invocation could be replaced by adding a dependencies.json to the package root.

  2. tree_sitter_languages provides pre-compiled languages, which could make the build steps, and thus the whole external Python dependency tree, obsolete.

If you don't want to vendor those in your package, you can add a dependencies.json with the following content, and Package Control will take care of keeping them up to date.

{
    "*": {
        ">=4000": [
            "tree_sitter",
            "tree_sitter_languages"
        ]
    }
}

The big advantage of this approach would be an experience that works out of the box for end users, without them having to bother with an external Python 3.8 interpreter.
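
As a rough sketch of what this could look like from the plugin side, the plugin could check at load time whether the libraries are importable and fall back to the external build steps only when they are not. The helper name `missing_dependencies` is hypothetical; it is not part of Package Control or STS.

```python
import importlib.util
from typing import Iterable, List


def missing_dependencies(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found.

    Hypothetical helper: it only checks that a module spec exists.
    It does not import the module or verify its version.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed usage: only fall back to a local build if something is missing.
missing = missing_dependencies(["tree_sitter", "tree_sitter_languages"])
if missing:
    print("Not importable yet:", ", ".join(missing))
```

Checking for a spec rather than importing keeps the check cheap and avoids loading native code just to find out whether it is there.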

see:

  1. https://github.com/packagecontrol/channel/commit/027e9c21a036438dcb90c5f904682ee826ed17f2
  2. https://github.com/packagecontrol/channel/commit/0d0f5d5b4b6537719dab8128ebd39637fcf105df

kylebebak commented 7 months ago

Hey @deathaxe, thank you for making tree_sitter installable as a "dependency". I'll modify STS to depend on this. This removes the dependency on an external pip, which means we can remove the pip_path setting

Also, thank you for looking into tree_sitter_languages, I hadn't seen this package before

The upside to depending on tree_sitter_languages is that STS users wouldn't require an external Python, and we could remove the python_path setting. This would be a very big win for usability; for the languages pre-built in tree_sitter_languages, STS would just work

The downside is loss of configurability and completeness. tree_sitter_languages downloads language repos based on what's in this file, then builds the .so/.dll files. There are hundreds of well-maintained Tree-sitter language repos. Neovim supports almost 250 out of the box. tree_sitter_languages only supports 48. For example, it's missing vue, svelte and scss

STS currently supports 32 languages out of the box, but it can be extended to support any number of languages using the language_name_to_repo setting. Upgrading to a new version of a language can be done with the tree_sitter_update_language command. Versions can also be pinned to a commit hash or branch. If we depend on tree_sitter_languages, users lose this flexibility

Regarding pre-building Tree-sitter binaries and removing the external Python dependency... I considered this while I was working on STS, but I couldn't think of a way to do it without taking control away from users. I think it's better for early users to have more control, and let the community figure out how to improve the UX (I'm not sure tree_sitter_languages is the way to do that)

cc @kaste @braver @keith-hall

deathaxe commented 7 months ago

I see this package aiming for the maximum possible flexibility. A major part is package-manager-like functionality for grammars. That's basically a good thing. It just wasn't clear that it needs not only a dedicated Python 3.8 but also the MS build tools. That's probably less of a concern on Linux/macOS, but on Windows it's not so common to have the correct compiler present. I can only guess that most users might struggle with it, too. I may be wrong, though.

The possibility of compiling syntaxes locally doesn't need to be dropped. Maybe it can be made optional by also shipping a working starter kit that provides the most common syntaxes without local build steps. tree_sitter_languages is a possible, but maybe not the best, solution.

Helix, for example, ships precompiled DLLs for each language it supports.

Maybe there's a way to crawl all or most known repos frequently, compile languages upstream, and deploy the compiled binaries. TreeSitter would then just need to download the binary for each grammar.

One idea is to use GitHub Actions to frequently crawl a defined set of repos, pre-compile languages for all supported platforms, and deploy them via GitHub Pages (to have static links). It probably wouldn't allow installing a specific version of a grammar, but it could be a good enough approach for most users.
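
To illustrate what "static links" could mean on the client side, here is a minimal sketch of how a plugin might derive a download URL for a prebuilt grammar. Everything in it is an assumption made for illustration: the function name, the directory layout, the file extensions, and the `example.invalid` host. No such service exists in this thread.

```python
# Hypothetical mapping from platform name to shared-library extension.
# The platform names are assumed to match what the plugin reports.
LIBRARY_EXTENSIONS = {"windows": "dll", "osx": "dylib", "linux": "so"}


def prebuilt_binary_url(base_url: str, language: str, platform: str, arch: str) -> str:
    """Build a static URL such as <base>/<platform>-<arch>/<language>.<ext>."""
    try:
        extension = LIBRARY_EXTENSIONS[platform]
    except KeyError:
        raise ValueError("unsupported platform: {}".format(platform))
    return "{}/{}-{}/{}.{}".format(
        base_url.rstrip("/"), platform, arch, language, extension
    )


# Placeholder host; a real deployment would use its own GitHub Pages URL.
print(prebuilt_binary_url("https://example.invalid/grammars", "python", "linux", "x64"))
```

Because the URL carries no version component, this layout always serves the latest build, which matches the limitation mentioned above.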

kaste commented 7 months ago

I would definitely pull in tree_sitter_languages, as this provides the best first impression and user experience. It is the most welcoming approach, especially since compiling through the plugin still needs setup, and having Python 3.8 available out of the box is already uncommon because it is so old.

(I recommend rye on all platforms because it downloads cleverly built Python versions (from https://github.com/indygreg/python-build-standalone) and never interferes with the system Python. If you already have pyenv, use whatever you want, of course, but if not, choose the newer rye.)

I would then imagine that the settings have a

    installed_languages: {
        "python": "path/to/a/compiled/language/file"
    }

instead of the list. These files take precedence over the ones in tree_sitter_languages.

The plugin could still do the compilation and fill in the values programmatically, but the format would allow a user to just point to the correct files quickly.

(Maybe you already have them for neovim et al.) (Or you hack on tree_sitter.)
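
The precedence rule could be sketched roughly like this. The function name and return shape are made up for illustration, and `bundled` stands in for whatever set of language names tree_sitter_languages ships; none of this is STS's actual code.

```python
from typing import Dict, Optional, Set, Tuple


def resolve_language(
    name: str,
    installed_languages: Dict[str, str],
    bundled: Set[str],
) -> Optional[Tuple[str, str]]:
    """Decide where a language should be loaded from.

    Returns ("file", path) for a user-supplied compiled file,
    ("bundled", name) for a pre-built language, or None if unknown.
    """
    path = installed_languages.get(name)
    if path:
        # A user-provided file always wins over the bundled build.
        return ("file", path)
    if name in bundled:
        return ("bundled", name)
    return None


settings = {"python": "path/to/a/compiled/language/file"}
print(resolve_language("python", settings, {"python", "json"}))
print(resolve_language("json", settings, {"python", "json"}))
```

With this shape, a locally compiled grammar and a hand-written path are indistinguishable to the loader, so the plugin's own build step can simply write into the same setting.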

kylebebak commented 7 months ago

This is closed by this release: https://github.com/sublime-treesitter/TreeSitter/releases/tag/1.1.0

STS now bundles tree_sitter and tree_sitter_languages as dependencies. Setting python_path is no longer required. Thanks deathaxe and kaste

See more context here: https://github.com/wbond/package_control_channel/pull/8862#issuecomment-1916027470