grantjenks / py-tree-sitter-languages

Binary Python wheels for all tree sitter languages.

Language detection #21

Open tmm1 opened 1 year ago

tmm1 commented 1 year ago

:wave: first off, huge thanks for putting this package together!

i'm wondering, with all these languages available what is the recommended way to pick a parser/language for a given file?

i see that each language implementation has a package.json section for tree-sitter configuration:

https://github.com/tree-sitter/tree-sitter-python/blob/master/package.json#L28-L32 https://github.com/latex-lsp/tree-sitter-latex/issues/19

perhaps the build process could pluck out these entries and make them available? so then a user could simply apply the file_types and content_regex rules to figure out what language to use.
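A minimal sketch of what "plucking out" those entries could look like at build time. The sample JSON is only modeled on the shape of the linked `tree-sitter` section, and it assumes the keys are spelled with hyphens (`file-types`, `content-regex`) in `package.json`; both the sample and the `extract_rules` helper are hypothetical and would need checking against each grammar's actual file.

```python
import json

# Illustrative stand-in for one grammar's package.json (not copied from any repo).
SAMPLE_PACKAGE_JSON = """
{
  "name": "tree-sitter-python",
  "tree-sitter": [
    {"scope": "source.python", "file-types": ["py"]}
  ]
}
"""


def extract_rules(package_json_text, language):
    """Collect detection rules from the "tree-sitter" section of a package.json.

    `language` is the name the caller wants to associate with these rules;
    it is passed in rather than inferred from the JSON.
    """
    data = json.loads(package_json_text)
    rules = []
    for entry in data.get("tree-sitter", []):
        rules.append(
            {
                "language": language,
                # Assumed key names; missing keys fall back to "no rule".
                "file_types": list(entry.get("file-types", [])),
                "content_regex": entry.get("content-regex"),
            }
        )
    return rules


rules = extract_rules(SAMPLE_PACKAGE_JSON, "python")
print(rules)
```

The build could run something like this over every vendored grammar and ship the combined list as package data, so users never have to read the `package.json` files themselves.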

tmm1 commented 1 year ago

> i'm wondering, with all these languages available what is the recommended way to pick a parser/language for a given file?

there's some logic in the tree-sitter cli that does this, but unfortunately it's not part of the actual library

i guess most people are using linguist or integrating into editor environments where they already have textmate compatible language detection.

seems like it may be beneficial to port and bundle the detection code in this python package, so users don't have to reimplement it. wdyt?

here's the reference impl:

https://github.com/tree-sitter/tree-sitter/blob/b8f7645ae2a5e240e67f968c89328af280055c9f/cli/loader/src/lib.rs#L207-L223
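For discussion, here is a rough Python sketch of that kind of detection as I understand it: find every configuration whose file types match the file name or extension, and if more than one matches, use the content regexes to break the tie. It is not a port of the linked Rust code. The `RULES` table, the `detect_language` name, and the scoring details are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Hypothetical rule table; real entries would come from each grammar's config.
RULES = [
    {"language": "c", "file_types": ["h", "c"], "content_regex": None},
    {
        "language": "cpp",
        "file_types": ["h", "cc", "cpp"],
        "content_regex": r"\bclass\s+\w+|\bnamespace\s+\w+",
    },
    {"language": "python", "file_types": ["py"], "content_regex": None},
]


def detect_language(path, content="", rules=RULES):
    """Return the best-matching language name for `path`, or None."""
    p = Path(path)
    # Match on the full file name (e.g. "Makefile") or the bare extension.
    keys = {p.name}
    if p.suffix:
        keys.add(p.suffix.lstrip("."))
    candidates = [r for r in rules if keys & set(r["file_types"])]

    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]["language"]

    # Several candidates: score each one by its content regex.
    best, best_score = None, -2
    for rule in candidates:
        regex = rule["content_regex"]
        if regex is None:
            score = 0  # no regex: neutral
        else:
            match = re.search(regex, content)
            # Longer first match wins; a regex that fails to match is penalized.
            score = len(match.group(0)) if match else -1
        if score > best_score:
            best, best_score = rule, score
    return best["language"]


print(detect_language("foo.py"))
print(detect_language("foo.h", "namespace foo {}"))
```

The returned name could then be handed to whatever lookup this package exposes for parsers, which keeps detection separate from loading the grammar.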

cc @nathansobo do i have this right?

grantjenks commented 1 year ago

I’m open to a PR but unlikely to do it myself.