ambv closed this issue 6 years ago
The parser already handles all of the type annotation syntax that I'm aware of. For example, the code snippet that @ambv posted above would parse like this:
(module [0, 0] - [4, 0]
  (function_definition [0, 0] - [4, 0]
    (identifier [0, 4] - [0, 25])
    (parameters [0, 25] - [2, 1]
      (typed_parameter [1, 4] - [1, 14]
        (identifier [1, 4] - [1, 8])
        (type [1, 10] - [1, 14]
          (identifier [1, 10] - [1, 14])))
      (typed_default_parameter [1, 16] - [1, 34]
        (identifier [1, 16] - [1, 20])
        (type [1, 22] - [1, 26]
          (identifier [1, 22] - [1, 26]))
        (false [1, 29] - [1, 34]))
      (typed_default_parameter [1, 36] - [1, 65]
        (identifier [1, 36] - [1, 40])
        (type [1, 42] - [1, 60]
          (subscript [1, 42] - [1, 60]
            (identifier [1, 42] - [1, 52])
            (identifier [1, 53] - [1, 59])))
        (tuple [1, 63] - [1, 65])))
    (type [2, 5] - [2, 19]
      (subscript [2, 5] - [2, 19]
        (identifier [2, 5] - [2, 13])
        (identifier [2, 14] - [2, 18])))
    (pass_statement [3, 4] - [3, 8])))
(sorry, deleted my comment since I misunderstood)
If I understand now, we just need named rules for different parts to map to scopes in the language grammar?
I think @ambv wants to have separate nodes for the brackets so they can be scoped differently.
Something like changing:
subscript: $ => seq(
  $._primary_expression,
  '[',
  commaSep1(choice($._expression, $.slice)),
  optional(','),
  ']'
),
to
subscript: $ => seq(
  $._primary_expression,
  $.lookup
),
lookup: $ => seq(
  '[',
  $.lookup_exp,
  ']'
),
lookup_exp: $ => seq(
  commaSep1(choice($._expression, $.slice)),
  optional(','),
),
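For anyone else new to the grammar DSL: `commaSep1` is not a tree-sitter builtin but, as far as I know, a small helper defined in the grammar file itself. Below is a minimal sketch of how such a helper composes. The `seq`, `repeat`, `choice` and `optional` functions here are stand-ins (the real ones are supplied by the tree-sitter CLI when it loads grammar.js), and the `sep1` name and the string rule references are assumptions for illustration, so that the snippet runs on its own in Node.

```javascript
// Stand-ins for the tree-sitter DSL functions (assumption: the real ones are
// provided by the tree-sitter CLI; these stubs only record the rule shape).
const seq = (...members) => ({ type: 'SEQ', members });
const repeat = (content) => ({ type: 'REPEAT', content });
const choice = (...members) => ({ type: 'CHOICE', members });
const optional = (content) => ({ type: 'OPTIONAL', content });

// One or more occurrences of `rule`, separated by `separator`.
function sep1(rule, separator) {
  return seq(rule, repeat(seq(separator, rule)));
}

// One or more occurrences of `rule`, separated by commas.
function commaSep1(rule) {
  return sep1(rule, ',');
}

// Shape of the proposed `lookup_exp` body. Plain strings stand in for the
// `$._expression` and `$.slice` rule references.
const lookupExp = seq(
  commaSep1(choice('_expression', 'slice')),
  optional(',')
);

console.log(JSON.stringify(lookupExp, null, 2));
```

The printed structure shows that the bracket tokens are not part of `lookup_exp` at all, which is what would let `[` and `]` live in their own `lookup` node.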
(and apologies, I just started looking into tree-sitter today for another language)
Yeah, no worries! This whole system is pretty new so there's not much documentation yet.
I actually don't think we need to make any changes to the parser for this issue; we just need to configure certain tokens like `:` and `=` to be highlighted in Atom. That configuration lives here. We probably just need to add some more lines similar to these, describing what classes we want to apply to those tokens.
In that `scopes` object, the keys are CSS selectors that select nodes in the syntax tree, and the values represent the list of classes to apply to those nodes for syntax highlighting. The syntax for referring to anonymous tokens (ones like `:` and `=` that don't have names in the grammar) is to surround them with double quotes.
Gotcha. So something like?
scopes:
  'subscript > "["': 'punctuation.definition.arguments.begin.python'
  'subscript > "]"': 'punctuation.definition.arguments.end.python'
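To make the selector convention concrete, here is a toy sketch of how a child selector such as `subscript > "["` could be resolved against a syntax node. This is not Atom's actual matcher: the node shape (`type`, `named`, `parent`) and the `selectorName` and `classesFor` functions are hypothetical, and only a single `parent > child` step is supported.

```javascript
// Scope map in the style discussed above: selector -> highlighting class.
const scopes = {
  'subscript > "["': 'punctuation.definition.arguments.begin.python',
  'subscript > "]"': 'punctuation.definition.arguments.end.python',
};

// Hypothetical node shape: { type, named, parent }.
// Anonymous tokens (named: false) are written in double quotes in selectors.
function selectorName(node) {
  return node.named ? node.type : `"${node.type}"`;
}

// Toy matcher: builds a 'parent > child' key and looks it up in the map.
function classesFor(node, scopeMap) {
  if (!node.parent) return undefined;
  const key = `${selectorName(node.parent)} > ${selectorName(node)}`;
  return scopeMap[key];
}

const subscript = { type: 'subscript', named: true, parent: null };
const list = { type: 'list', named: true, parent: null };
const subscriptBracket = { type: '[', named: false, parent: subscript };
const listBracket = { type: '[', named: false, parent: list };

console.log(classesFor(subscriptBracket, scopes)); // punctuation.definition.arguments.begin.python
console.log(classesFor(listBracket, scopes));      // undefined
```

Under these assumptions, the same `[` token gets a class only when its parent is a `subscript` node, so brackets in a list literal are left alone.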
I addressed this by naming the anonymous tokens in tree-sitter-python.cson in language-python. See my pull request there.
By punctuation I essentially mean brackets, dots, commas, colons. This makes it hard to color them differently from regular text.
Example where coloring punctuation would make text easier to read: