Closed mjambon closed 6 months ago
@dcreager @maxbrunsfeld @patrickt thoughts on eliminating the javascript-only grammar?
I still think it's valuable to have the JavaScript language explicitly implemented as its own smaller, simpler grammar. JavaScript is relatively stable, and has very precisely documented structure, whereas TypeScript is still constantly changing and has a much less thorough spec.
Are there other ways that we could make it easier to bump the JavaScript dependency the TypeScript grammar? I'm open to replacing the package.json
dependency with something that's not tied to nom, like a git submodule.
I think we need to ensure, at the time of committing a javascript fix, that the fix also works for typescript.
If we used tree-sitter-javascript as git submodule of tree-sitter-typescript, we'd need the javascript CI to run the javascript tests using the typescript parsers. The flow would be something like:
Maybe the git submodule solution is fine if (2) succeeds most of the time. I think step (3) is a bit too much to ask unless it's exceptional.
Another solution would be to merge the javascript and typescript repos but keep the grammars separate as they are today. This would be simpler for contributors. Running tests locally would automatically run the javascript tests with the typescript parser, which is more direct than having to rely on CI for it.
I recently mentioned this idea in our team, and one of things that was mentioned was that there are more "variants" of JS. Flow was mentioned as an example, which adds its own syntax (unfortunately even without changing the extension 😦).
In light of that, I think having a separate JS grammar is good, because it allows deriving other variant next to TS. If there would be more variants next to TS, having them all in the same repo might make developing the JS grammar harder, as issues with the derivatives need to be fixed immediately as part of the JS change.
@hendrikvanantwerpen I'm optimistic about a repo merge because the technical expertise needed to work on any of these projects is basically the same. As a result, we'd get a bigger contributor pool and happier contributors - well, at least the TypeScript and Flow contributors would be happier, and I'm sure they would be glad to assist in case of any difficulty with a new JavaScript feature.
I think we need to ensure, at the time of committing a javascript fix, that the fix also works for typescript. If we used tree-sitter-javascript as git submodule of tree-sitter-typescript, we'd need the javascript CI to run the javascript tests using the typescript parsers.
I would rather treat tree-sitter-javascript
as an independent project, and have the TypeScript grammar depend on a specific version of it, similar to the normal workflow for dependencies. Generally, it's fine to commit changes to tree-sitter-javascript
as long as the tree-sitter-javascript
tests pass.
The TypeScript grammar doesn't necessarily have to always immediately pull in the most recent commit of tree-sitter-javascript
. We can choose when to bump the version of tree-sitter-javascript
that tree-sitter-typescript
uses. And at the time of bumping the dependency, we need to make sure that the TypeScript tests pass.
I agree with Max on this — I think this issue is showing that we need to do a better job of assigning version numbers to the JS grammar, and to have more precise version ranges (or even specific pinned commit SHAs) on the TypeScript side. If we're making changes to the JS grammar that would break the TS grammar, those changes could also presumably break other consumers of the JS grammar, and need to communicated via appropriate version numbers.
I think this issue is showing that we need to do a better job of assigning version numbers to the JS grammar, and to have more precise version ranges
Who would do this work and when?
I personally think the git sha is the best way of locking the version down, either in the package.json, as we have it now, or just via a git sub module.
I don’t think there’s a meaningful way to encode grammar changes in semver. Strictly speaking, almost any change would be a major version bump. Managing the version adds a bunch of ceremony to the development process, and I don’t think it adds much value as compared to just using git revisions directly.
Maintaining tree-sitter-typescript as an extension of tree-sitter-javascript tends to involve extra work compared to a grammar written from scratch (#189, #190 + personal experience with semgrep).
Since typescript (tsx mode) is compatible with javascript and jsx, what's the value of maintaining a parser that supports only javascript? Do we have users who would be bothered if we only provided the tsx dialect as a javascript/jsx parser?
/cc @ruricolist