Closed nicolas-guichard closed 11 months ago
@nicolas-guichard tests are failing on CI
Indeed, I found the reason for the test failure:
When _type is defined as:
_type: $ => seq(
optional($.type_modifiers),
choice(
$.parenthesized_type,
$.nullable_type,
$._type_reference,
$.function_type,
$.not_nullable_type
)
)
then @Foo T & F
is parsed as:
_type @Foo T & F
type_modifiers @Foo
annotation @
user_type Foo
not_nullable_type T & F
user_type T
user_type F
But when _type uses a named field around the choice
:
_type: $ => seq(
optional($.type_modifiers),
field('type', choice(
$.parenthesized_type,
$.nullable_type,
$._type_reference,
$.function_type,
$.not_nullable_type
))
)
then @Foo T & F
is parsed as:
_type @Foo T & F
not_nullable_type @Foo T & F
type_modifiers @Foo
annotation @
user_type Foo
user_type T
user_type F
I don't really understand why adding a named field changes the precedence here.
types.txt
explicitly tests for the first one, so I assume that's what we want, although the Kotlin grammar is quite surprising here: not_nullable_type
is only ever used by type
, and type
already has an optional type_modifiers
, so why does not_nullable_type
need one too if it doesn't even have precedence?
I've removed the named field as a workaround for now, but I'm wondering if it would make sense to remove the optional type_modifiers from not_nullable_type to make the grammar less ambiguous.
@nicolas-guichard This is a bit concerning as we might potentially miss a similar test case. I guess we can't name fields of different types without creating a wrapping node.
I've looked at this issue where field names were introduced. https://github.com/tree-sitter/tree-sitter/pull/271
One thing to note is that tree-sitter parser <file>
will return a tree with all field names.
What I would suggest doing is instead of adding all possible fields right now. Start smaller instead with only a couple of fields you need BUT additionally updating unit tests to verify fields are correctly named in every single place.
I'm mainly interested with elements that can have an inner block, so I've done those (+ type_alias
because every other child of _declaration
had named fields so I thought it made sense for “symmetry”).
This is split into commits that should each pass the tests for easier reviewing, but, given that parser.c
is committed into the repo I think it would be wise to squash everything before merging, or extracting the parser.c
update into a separate commit, to avoid needlessly increasing the repo size.
@nicolas-guichard I'm not concerned about splitting this in multiple commits since it'll get squashed anyway. Thanks for adding the tests! If you can rebase on master I'll re-run tests and merge
We really need to get the parser size under control again... this PR bumped it from 22 MB to 70 MB. The Git CLI warns about it too:
remote: warning: File src/parser.c is 70.42 MB; this is larger than GitHub's recommended maximum file size of 50.00 MB
remote: warning: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com./
I've reverted this PR for now. Feel free to rebase and open a new PR, but please make sure to keep the parser at a reasonable size, ideally < 30 MB, and also that the CI checks pass (specifically https://github.com/fwcd/tree-sitter-kotlin/issues/92).
Hi,
This is a continuation of /pull/74 given Vladimir Reshetnikov said they would not be pursuing it anymore.
In Searchfox, we leverage tree-sitter to add some information on where symbols are found. For instance if you search for the C++ symbol mozilla::dom::MediaControlKeySource::IsOpened, and you find 9 results in the file
dom/mediacontrol/MediaControlKeyManager.cpp
, it's super useful to know in which function each result was found (tree-sitter is what enables those// found in
comments). We are adding Kotlin support to Searchfox, hence this PR.For now we only use the
name
andbody
fields ofclass_definition
andfunction_declaration
, but since there was a pull request hanging around to add named fields everywhere I just rebased it on main. If you think it's too much (this adds almost 2M lines toparser.c
!), I can reduce the number of named fields. They could be useful to someone else though.