aMOPel / tree-sitter-nim

tree-sitter parser for the nim programming language
MIT License

Split queries into editor-specific files #18

Closed omentic closed 1 year ago

aMOPel commented 1 year ago

Good work!

Can't judge the helix queries.

I see some things I disagree with in the nvim queries, but also things I overlooked. I'm gonna merge those with my branch later.

E.g.:

((primary (symbol) @type)
 (#match? @type "[A-Z].+"))
; assume PascalCase identifiers to be types

I don't think it's a good idea to highlight based on regex here. You really can't assume that types are gonna be PascalCase. It's not a restriction in the language, only a recommendation. Constants are supposed to be PascalCase too, so the convention doesn't even single out types.

omentic commented 1 year ago

Yeah, I was torn there. But as far as I can tell there's no other way to disambiguate identifiers in for example a case statement where both types and variables are commonly used (and the Rust query did it :shrug:).

omentic commented 1 year ago

I did try to pull in a bunch of stuff from your branch. I think the big thing I didn't merge was some disambiguation around routines (oh, and the ["(" ")"] portions - i didn't quite understand those): I think if we bother we should probably match verbatim on the keyword to tell apart macros, methods, etc, but it's a lot of code because we'd have to do it three times and it doesn't help highlighting at all.

aMOPel commented 1 year ago

and the ["(" ")"] portions - i didn't quite understand those)

Those are there to also match the parentheses of the function call etc.

aMOPel commented 1 year ago

Alright, so my thoughts on your nvim highlight queries:

1.


(primary
  (primarySuffix
    (qualifiedSuffix
      (symbol) @variable.other.member)))
; overzealous, matches x in foo.x

"@variable.other.member" is not specified in the nvim tree sitter spec

2.

((primary (symbol) @type)
 (#match? @type "[A-Z].+"))
; assume PascalCase identifiers to be types

((primary
  (primarySuffix
    (qualifiedSuffix
      (symbol) @type)))
 (#match? @type "[A-Z].+"))
; assume PascalCase member variables to be enum entries

; COMMENT: this is especially untrue, since enum values are often in camelCase, eg `nkSym`.
; If anything, the enum type is in PascalCase.

As I said, I don't think the assumption is a good idea.
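
To make the concern concrete, here is a minimal Python sketch. It assumes `#match?` searches for the pattern anywhere in the node text (an unanchored search), and Python's `re` only stands in for the editor's regex engine. The identifier list is made up for illustration (apart from `nkSym`), and `ANCHORED` is a hypothetical stricter variant, not something from either branch.

```python
import re

PATTERN = re.compile(r"[A-Z].+")  # the regex used in the query
ANCHORED = re.compile(r"^[A-Z]")  # hypothetical variant: must start uppercase

# Illustrative identifiers: a type, a camelCase enum value,
# a PascalCase constant, and a plain variable.
identifiers = ["TreeNode", "nkSym", "MaxDepth", "counter"]


def captured(pattern, names):
    """Return the names an unanchored search with `pattern` would accept."""
    return [name for name in names if pattern.search(name)]


loose = captured(PATTERN, identifiers)
strict = captured(ANCHORED, identifiers)
print(loose)
print(strict)
```

Under these assumptions the unanchored pattern also accepts `nkSym`, because the `S` in the middle is followed by more characters. Anchoring removes that case, but a PascalCase constant such as `MaxDepth` is still captured as a type, so the naming convention alone cannot separate the two.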

there's no other way to disambiguate identifiers in for example a case statement where both types and variables are commonly used

Maybe I'm misunderstanding something, but as far as I'm aware there can't be a type after the case keyword. You are probably referring to enums, but then there might be a misunderstanding: the enum itself is a type, yes, but what you put in a case stmt is a variable whose type is an enum, and that variable can only hold one value of that type.

So enum values could be considered literals, like ints. However, since they have the syntax of an identifier and appear in the same places as variables, they are indistinguishable from variables by syntax analysis alone.

I haven't understood the scope section in nvim treesitter yet, maybe that could help to distinguish enum values.

3.

(variable
  (keyw) @type.definition
  (declColonEquals (symbol) @variable))
; let, var, const expressions

I don't think that's the right usage of those captures. From the spec:

;@variable         ; various variable names
;@type.definition ; type definitions (e.g. `typedef` in C)

So I would capture the name of the variable in the definition as @variable and the keyword (let, var, const) as @keyword.

4.

(symbolEqExpr
  (symbol) @variable)
; named parameters

The symbol in named parameters in a function call is not a variable, but a parameter. The value after the = can be a variable. There is a @parameter capture. But it's a good idea to capture the named parameters. I missed that. I added

(functionCall (symbolEqExprList (symbolEqExpr (symbol) @parameter)))

5.

(symbolColonExpr
  (symbol) @variable)
; object constructor parameters

Also a good idea to capture object fields in the constructor. I think the @field capture is more appropriate though. I added

(objectConstr (symbolColonExpr (symbol) @field))

6.

(identColon (ident) @variable)
; named parts of tuples

Same goes for tuple "fields". Dunno what would be a better term.

(tupleConstr (symbolColonExpr (symbol) @field))

7.

(primary
  . (symbol) @function.call
  . (primarySuffix (functionCall)))
; regular function calls

(primary
  . (symbol) @function.call
  . (primarySuffix (cmdCall)))
; function calls without parentheses

(primary
  (primarySuffix (qualifiedSuffix (symbol) @function.call))
  . (primarySuffix (functionCall)))
; uniform function call syntax calls

Very good. I didn't think about using the . anchor there.

8.

[(operator) (opr) "="] @operator

Makes sense to capture "=". I added

(declaration (variable (declColonEquals "=" @operator)))
(exprStmt "=" @operator)

9.

[
  "("
  ")"
  "["
  "]"
  "{"
  "}"
  "{."
  ".}"
  "#["
  "]#"
] @punctuation.bracket

I disagree with `{.` `.}` and `#[` `]#`. Those fit better into @pragma and @comment, respectively. Also, I think it's better to be precise here. I have:

(tupleConstr ["(" ")"] @punctuation.bracket)
(arrayConstr ["[" "]"] @punctuation.bracket)
(tableConstr ["{" "}"] @punctuation.bracket)
(setConstr ["{" "}"] @punctuation.bracket)
(genericParamList ["[" "]"] @punctuation.bracket)
; TODO: doesn't work with ["[" "]"]-style tokens because of token shenanigans in grammar
(indexSuffix) @punctuation.bracket

Gonna continue later.

aMOPel commented 1 year ago

10.

[(literal) (generalizedLit)] @constant

From the spec:

;@constant ; constant identifiers

So I think @constant is about const variables and not about literals. After all, there are several capture groups for literals. I have this:

(declaration (constant (keyw) @keyword (declColonEquals (symbol) @constant)))

11.

[(nil_lit)] @constant.builtin

Again, not a constant. It could be handled as a keyword or as a literal. I settled on the latter and grouped it with bool_lit.

[
  (bool_lit)
  (nil_lit)
] @boolean ; boolean literals

12.

[(int_lit) (int_suffix)] @number
[(float_lit) (float_suffix)] @float

Very good. I added the suffixes.

13.

[(str_lit) (triplestr_lit) (rstr_lit)] @string
[(generalized_str_lit) (generalized_triplestr_lit) (interpolated_str_lit) (interpolated_triplestr_lit)] @string.special

From the spec:

;@punctuation.special   ; special symbols (e.g. `{}` in string interpolation)
;@string.special       ; other special strings (e.g. dates)

So I have:

(interpolated_str_lit "&" @punctuation.special)
(interpolated_str_lit "{" @punctuation.special)
(interpolated_str_lit "}" @punctuation.special)
;@punctuation.special   ; special symbols (e.g. `{}` in string interpolation)

[
  (str_lit)
  (rstr_lit)
  (triplestr_lit)
  (interpolated_str_lit)
  (interpolated_triplestr_lit)
  (generalized_str_lit)
  (generalized_triplestr_lit)
] @string ; string literals

I did add the generalized string literals though, as I had forgotten them.

14.

(typeDef
  (keyw) @type.definition
  (symbol) @type)

I would capture the keyword as @keyword and the symbol as @type.definition. From the spec:

;@type.definition ; type definitions (e.g. `typedef` in C)

From the C highlight queries:

(type_definition
  declarator: (type_identifier) @type.definition)

So I have:

(typeDef (symbol) @type.definition)

15.

(primarySuffix
  (indexSuffix
    (exprColonEqExprList
      (exprColonEqExpr
        (expr
          (primary
            (symbol) @type))))))

IndexSuffix is a problem. In the official grammar it is used for both indexing (a[5]) and generic parameters (array[5, string]). I should split it in the future. Another issue is that it currently uses token.immediate(), which kind of swallows the opening [, meaning you can't highlight that one separately. I added:

(primaryTypeDesc 
  (primarySuffix
    (indexSuffix
      (exprColonEqExprList
        (exprColonEqExpr
          (expr
            (primary
              (symbol) @type)))))))
(primaryTypeDef 
  (primarySuffix
    (indexSuffix
      (exprColonEqExprList
        (exprColonEqExpr
          (expr
            (primary
              (symbol) @type)))))))

for this special case

16.

(primaryTypeDef
  (symbol) @type)
; primary types of type declarations (nested types in brackets are matched with above)

(primaryTypeDesc
  (symbol) @type)
; type annotations, on declarations or in objects

(primaryTypeDesc
  (primaryPrefix
    (keyw) @type))
; var types

(genericParamList
  (genericParam
    (symbol) @type))
; types in generic blocks

(enumDecl
  (keyw) @type.definition
  (enumElement
    (symbol) @type.enum.variant))

(tupleDecl
  (keyw) @type.definition)

(objectDecl
  (keyw) @type.definition)

@type.enum.variant doesn't exist in nvim. As I said in 3. and 14., @type.definition is for the identifier, not the keyword. I have:

(primaryTypeDef (symbol) @type)
(primaryTypeDesc (symbol) @type)
(genericParam (symbol) @type)
(tupleDecl (keyw) @type)
(enumDecl (keyw) @type)
(objectDecl (keyw) @type)
(conceptDecl (keyw) @type)

17.

(objectPart
  (symbol) @variable.other.member)
; object field

Very good. I added:

(objectDecl (objectPart (symbol) @field))
(tupleDecl (identColon (ident) @field))

18.

(objectCase
  (keyw) @conditional
  (symbol) @variable
  (objectBranches
    ; (objectWhen (keyw) @conditional)?
    (objectElse (keyw) @conditional)?
    (objectElif (keyw) @conditional)?
    (objectBranch (keyw) @conditional)?))

Great. Added:

(objectCase (keyw) @conditional (symbol) @variable)
(objectBranch (keyw) @conditional)
(objectElif (keyw) @conditional)
(objectElse (keyw) @conditional)
(objectWhen (keyw) @conditional)

I noticed that the ? and other quantifiers make the query match the same section multiple times, so I try to avoid them. Splitting it into separate patterns is also a little more concise.

Gonna continue later.

aMOPel commented 1 year ago

19.

(conceptDecl
  (keyw) @type.definition
  (conceptParam
    (symbol) @variable))

I added:

(conceptDecl (keyw) @type)
(conceptParam (keyw)  @type.qualifier)
(conceptParam (symbol) @variable)

20.

((exprStmt
  (primary (symbol))
  (operator) @operator
  (primary (symbol) @type))
 (#match? @operator "is"))

Smart! I added

((exprStmt
  (primary (symbol))
  (operator) @keyword.operator
  (primary (symbol) @type))
 (#match? @keyword.operator "is"))
((expr
  (primary (symbol))
  (operator) @keyword.operator
  (primary (symbol) @type))
 (#match? @keyword.operator "is"))
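
One thing worth double-checking with this predicate: if `#match?` is an unanchored search, the pattern "is" also accepts any operator text that merely contains `is`, such as Nim's `isnot`. A minimal Python sketch of that behaviour, with `re` standing in for the editor's regex engine; the operator list is illustrative, and whether the grammar actually emits all of these as (operator) nodes is an assumption.

```python
import re

# Illustrative operator texts; whether the grammar exposes all of them
# as (operator) nodes is an assumption.
operators = ["is", "isnot", "in", "notin", "=="]

# Unanchored search, mirroring (#match? @keyword.operator "is").
unanchored = [op for op in operators if re.search(r"is", op)]

# Exact comparison, the behaviour an anchored pattern or #eq? would give.
exact = [op for op in operators if op == "is"]

print(unanchored)
print(exact)
```

If that assumption holds, an anchored pattern ("^is$") or the `#eq?` predicate would restrict the capture to the `is` operator alone.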

21.

(blockStmt
  (keyw) @repeat
  (symbol) @label)

BlockStmt has nothing to do with @repeat. It just opens a new scope.

22.

(forStmt
  (keyw) @repeat
  (symbol) @variable
  (keyw) @repeat)

Good stuff. I added:

(forStmt
  . (keyw) @repeat
  . (symbol) @variable
  . (keyw) @repeat)

23.

(importStmt
  (keyw) @include
  (expr (primary (symbol) @include)))
(importExceptStmt
  (keyw) @include
  (expr (primary (symbol) @include)))
(exportStmt
  (keyw) @include
  (expr (primary (symbol) @include)))
(fromStmt
  (keyw) @include
  (expr (primary (symbol) @include)))
(includeStmt
  (keyw) @include
  (expr (primary (symbol) @include)))

From the spec:

;@include ; keywords for including modules (e.g. `import`/`from` in Python)

So I would only capture the keywords.

24.

(breakStmt (keyw) @keyword.return)
(continueStmt (keyw) @keyword.return)

I added them to @repeat instead.

aMOPel commented 1 year ago

I haven't understood the scope section in nvim treesitter yet, maybe that could help to distinguish enum values.

https://github.com/nvim-treesitter/nvim-treesitter/issues/3098

Not yet apparently.