helixbass / tree-sitter-grep

The Unlicense

"Auto-detect" language? #22

Closed (helixbass closed this issue 1 year ago)

helixbass commented 1 year ago

It would be nice not to have to specify the --language command-line argument every time.

The biggest "issue" would seem to be that the tree-sitter query would presumably only parse for the actual intended target language (and maybe for some others?).

So maybe we know how to map from filename suffix to a supported language (which in turn knows how to map to e.g. the relevant tree-sitter grammar/parser). Then, as ignore iterates over the project files, we cache the outcome of trying to parse the query for each recognized language: (a) if the query previously failed to parse for that language, we skip the file; (b) if it previously parsed, the cached value is the parsed query, so we run that; (c) if we haven't tried yet for that language, we try to parse the query with the relevant tree-sitter parser and cache the result or the failure.
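A rough sketch of that caching idea, in case it helps the discussion. This is not what the crate does today: `Language`, `language_for_path` and `QueryCache` are hypothetical names, the suffix table is only illustrative, and `Q` stands in for whatever a compiled query ends up being (presumably something like `tree_sitter::Query`). The `compile` closure is where the real per-language query parsing would go.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Hypothetical set of supported languages (illustrative only).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum Language {
    Rust,
    TypeScript,
    Python,
}

/// Map a filename suffix to a supported language, or `None` if unrecognized.
pub fn language_for_path(path: &Path) -> Option<Language> {
    match path.extension()?.to_str()? {
        "rs" => Some(Language::Rust),
        "ts" | "tsx" => Some(Language::TypeScript),
        "py" => Some(Language::Python),
        _ => None,
    }
}

/// Per-language cache of query compilation outcomes.
/// `Some(query)` means the query parsed for that language, `None` means it failed.
pub struct QueryCache<Q> {
    entries: HashMap<Language, Option<Q>>,
}

impl<Q> QueryCache<Q> {
    pub fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Returns the compiled query to run against `path`,
    /// or `None` if the file should be skipped.
    pub fn query_for<E>(
        &mut self,
        path: &Path,
        compile: impl FnOnce(Language) -> Result<Q, E>,
    ) -> Option<&Q> {
        // Unrecognized suffix: nothing to run.
        let language = language_for_path(path)?;
        // (c) First file seen for this language: try to compile the query
        // and cache either the parsed query or the failure.
        if !self.entries.contains_key(&language) {
            self.entries.insert(language, compile(language).ok());
        }
        // (a) A cached failure yields `None`; (b) a cached success yields the query.
        self.entries.get(&language)?.as_ref()
    }
}
```

With something like this, the file walker would call `query_for` once per file and skip the file whenever it gets `None` back. The compile error is discarded here; a real version would probably want to keep it so that a query which parses for no language at all can still be reported to the user.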

Not sure whether this plays weirdly with the "filter plugin" API at all. Are there language-to-language differences in the "raw" tree-sitter API that would make a filter plugin choke in weird ways? But if so, maybe that's just on you if you decide to omit --language and use a filter plugin.