Open Wilfred opened 2 years ago
Looking at the objects, the main culprits are the precompiled vendor/*/src/parser.c files. I doubt that this can be fixed without excluding these and either generating them at build time with tree-sitter generate from the grammar or not vendoring them at all. Quite a few grammars are already available on crates.io.
I don't know if releasing sub-crates like difftastic-language-xxx is a good idea.
> I don't know if releasing sub-crates like difftastic-language-xxx is a good idea.
That's not what I meant. There are quite a few tree-sitter-* crates that one could depend on instead of vendoring them.
The majority of parsers in difftastic are either not available on crates.io, or the versions on crates.io are old.
I agree that the vendor/*/src/parser.c files are the biggest, and the SQL parser is particularly big: https://github.com/m-novikov/tree-sitter-sql/issues/59
If difftastic just had a snapshot of each parser, it wouldn't have the history of these large files, substantially reducing the size.
Alternatively, maybe it would make sense to look at creating the parser.c files during the build too. This would enable usage of the new, faster ABI https://github.com/tree-sitter/tree-sitter/pull/1852 and it's already the case that the Swift parser doesn't have parser.c checked in.
> Alternatively, maybe it would make sense to look at creating the parser.c files during the build too.
I prefer this approach. I'm interested in implementing it. Any notes for me?
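For anyone picking this up, here is a minimal sketch of the build-time generation idea, not difftastic's actual build script. It assumes the tree-sitter CLI is on PATH and that each vendored grammar directory contains a grammar.js at its top level. The helper names grammars_to_generate and generate_command are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process::Command;

/// List grammar directories under `vendor_root` that have a `grammar.js`
/// but no generated `src/parser.c` yet. (Hypothetical helper; the layout
/// is an assumption based on the vendor/*/src/parser.c paths above.)
fn grammars_to_generate(vendor_root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut pending = Vec::new();
    for entry in fs::read_dir(vendor_root)? {
        let dir = entry?.path();
        let has_grammar = dir.join("grammar.js").is_file();
        let has_parser = dir.join("src").join("parser.c").is_file();
        if has_grammar && !has_parser {
            pending.push(dir);
        }
    }
    // Sort so the build order is deterministic.
    pending.sort();
    Ok(pending)
}

/// Build (but do not run) the `tree-sitter generate` invocation for one
/// grammar directory. Assumes the `tree-sitter` CLI is installed.
fn generate_command(grammar_dir: &Path) -> Command {
    let mut cmd = Command::new("tree-sitter");
    cmd.arg("generate").current_dir(grammar_dir);
    cmd
}
```

A build script could loop over grammars_to_generate(Path::new("vendor")), call .status() on each command, and fail the build on a non-zero exit before compiling the generated C as it does today. The cost is that everyone building from source would need the tree-sitter CLI (and whatever it requires to evaluate grammar.js) installed.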
I think dynamically loading the parsers is the way forward: https://github.com/Wilfred/difftastic/pull/356 & #123
Could Git submodules be used here? That way, you could link to a specific version of each dependency without embedding it directly into the repo.
This is too much. It makes CI slower and contributing slower.
The git subtrees are getting too big. We might have to rewrite history to use snapshots of vendored parsers.