ionide / tree-sitter-fsharp

F# grammar for treesitter
MIT License
81 stars 18 forks source link

Simplifying identifiers #75

Closed KaranAhlawat closed 1 month ago

KaranAhlawat commented 1 month ago

This is a semi-desperate issue, but is there some way we could specify identifiers in a simpler manner? Currently, the long_identifier_or_op node has a few (3? 4?) valid combinations, and then they should be highlighted differently in different contexts, and it kinda just ends up exploding from there.

This isn't really a bug, but more of a would be nice-to-have. Please let me know if it's not possible and I'll close the issue.

Nsidorenco commented 1 month ago

I'm all for simplifying the structure, if it can make working with the AST easier. Could you perhaps give some examples of the highlighting queries you're trying to do, and why it blows up for you? Then we can work for there and see if we can simplify the structure, without introducing ambiguities

KaranAhlawat commented 1 month ago

One example is this query I'm writing (which I haven't been able to get to work so far)

    (application_expression
      (dot_expression
       base:
       (long_identifier_or_op
        :anchor
        [((long_identifier
           :anchor
           [((identifier) @font-lock-variable-use-face
             ("." (identifier) @font-lock-property-use-face):*
             "." (identifier) @font-lock-function-call-face :anchor)
            (identifier) @font-lock-function-call-face] :anchor))
         ((long_identifier
           (identifier) @font-lock-variable-use-face
           (identifier) :* @font-lock-property-use-face)
          :anchor (identifier) @font-lock-property-use-face :anchor)
         (identifier) @font-lock-variable-use-face])
       field:
       (long_identifier_or_op
        :anchor
        [((long_identifier
           :anchor
           [((identifier) @font-lock-property-use-face
             (identifier):* @font-lock-property-use-face
             "." (identifier) @font-lock-function-call-face :anchor)
            (identifier) @font-lock-function-call-face]) :anchor)
         ((_) @font-lock-property-use-face
          :anchor (identifier) @font-lock-function-call-face :anchor)
         (identifier) @font-lock-function-call-face])))

It tries to highlight the first identifier in a dot expression chain as a variable, all but the last field access (property/method) as property access and the last as function call (if inside application_expresssion).

The most complex part of this query comes from the long_identifier_or_op children, and it's possible combinations.

Nsidorenco commented 1 month ago

I think we are going to hit this issue with more than just long_identifier_or_op, there are so many nested options of how to colour application expressions etc.

For example, to colour the function call a subset of the colourings we could consider:

(application_expression
  .
  [
    (long_indentifier_or_op ...) @function.call
    (dot_expression ...) @function.call
    (paren_expression ...) @function.call
    (paren_expression (dot_expression ...) @function.call)
  ]
)

Particularly, the paren_expression with nested dot_expression could be nested arbitrarily many times 😅

I think a better solution might be to only highlight a construct at its most specific location, i.e.

(dot_expression . base: (_) @module)
(long_identifier_or_op . (identifier)+ @module (identifier))
(application_expression . (_) @function.call)
KaranAhlawat commented 1 month ago

Yeah, I definitely agree with you that it's not entirely easy to make sure what's what when F# has such flexibility.

I'll try the second approach and see where that lands me.

KaranAhlawat commented 1 month ago

One such query I came up with that works decently is this, as you can see it's quite long.

((application_expression
       (dot_expression
        base:
        [
         (long_identifier_or_op
          :anchor
          (long_identifier
           :anchor
           (identifier) @font-lock-type-face
           :anchor
           (identifier):* @font-lock-property-use-face
           :anchor
           (identifier) @font-lock-property-use-face
           :anchor)
          :anchor)
         (long_identifier_or_op
          :anchor
          (long_identifier
           :anchor
           (identifier) @font-lock-type-face
           :anchor)
          :anchor)
         (long_identifier_or_op
          :anchor
          (long_identifier
           :anchor
           (identifier) @font-lock-type-face
           (identifier):* @font-lock-property-use-face)
          :anchor (identifier) @font-lock-property-use-face
          :anchor)
         (long_identifier_or_op
          :anchor
          (identifier) @font-lock-type-face
          :anchor)
         ]

        field:
        [
         (long_identifier_or_op
          :anchor
          (long_identifier
           :anchor
           (identifier) @font-lock-property-use-face
           :anchor
           (identifier):* @font-lock-property-use-face
           :anchor
           (identifier) @font-lock-function-call-face :anchor)
          :anchor)
         (long_identifier_or_op
          :anchor
          (long_identifier
           :anchor
           (identifier) @font-lock-function-call-face :anchor)
          :anchor)
         (long_identifier_or_op
          :anchor
          (_) @font-lock-property-use-face
          :anchor (identifier) @font-lock-function-call-face :anchor)
         (long_identifier_or_op
          :anchor
          (identifier) @font-lock-function-call-face
          :anchor)
         ]))
      (:match "^[A-Z0-9]" @font-lock-type-face))
KaranAhlawat commented 1 month ago

I just went with your simpler capturing approach, so closing this issue.