gleam-lang / tree-sitter-gleam

🌳 A tree-sitter grammar for the Gleam programming language
Apache License 2.0
71 stars 13 forks source link

`#is-not? local` query cannot extend inference to multiple nodes #7

Closed J3RN closed 2 years ago

J3RN commented 2 years ago

I noticed some strange behavior with the syntax highlighting of records vs records with functions. Consequently, I tried to introduce this test:

fn thing() {
  let local_record = Bar(thing: fn(x) { x + 1 })
  local_record.thing(5)
  // ^ variable
  //           ^ property
}

However, this test fails with this error:

  ✗ functions.gleam
    Failure - row: 44, column: 15, expected highlight 'property', actual highlights: 'function'

So the syntax highlighter is able to determine that local_record is a variable, not a module, but it is not able to extend that inference to imply that therefore the second part of the field_access must be a property instead of a function.

the-mikedavis commented 2 years ago

~I think locals.scm is not smart enough to infer the typing of the record member. That might be something that you'd need stack graphs to tackle.~

~This case can be determined just from the syntax though, right?~

(function_call
  function: (field_access
    field: (identifier) @function))

~I'm not too familiar with gleam yet: is there another foo.bar() syntax other than function calls that would make that ambiguous?~

edit: actually I read this wrong :sweat_smile:

I think there's a highlights query that could cover this though, I'll try some out :+1:

the-mikedavis commented 2 years ago

I think this might be a bug in tree-sitter queries.

; highlights.scm
((function_call
   function: (field_access
     record: (identifier) @module
     field: (identifier) @function))
 (#is-not? local))

I would expect the (#is-not? local) predicate to prevent the stanza from capturing both @module and @function, but it appears to only be determining wether @module gets captured

J3RN commented 2 years ago

Right, which is quite weird. Since #is-not? local could be seen as ambiguous (are we asking about @module or @function?), I tried #is-not? local @module since it looks like the Rust query builder seems to support up to three arguments, but this had no effect. I might try diving deeper into the tree-sitter source later today to see if I can discover the root of this issue.

the-mikedavis commented 2 years ago

I was looking around and this slice of the documentation caught my eye:

When highlighting a file, Tree-sitter will keep track of the set of scopes that contains any given position, and the set of definitions within each scope. When processing a syntax node that is captured as a local.reference, Tree-sitter will try to find a definition for a name that matches the node's text. If it finds a match, Tree-sitter will ensure that the reference and the definition are colored the same. The information produced by this query can also be used by the highlights query. You can disable a pattern for nodes which have been identified as local variables by adding the predicate (#is-not? local) to the pattern...

In particular the way that the last sentence is worded suggests to me that the current behavior is intended. Based on the existing #is-not?s I see in the @tree-sitter repos, I don't think a use-case like this has come up before, so I think it might be worth posting a question issue in tree-sitter to see if they'd be open to a change that allows you to make a pattern conditional on the (#is-not? local @somecapture) which (as you linked) is currently supported.

I think the revelant change would be to the C source which I'm not at all familiar with. I'll keep looking to see how difficult this change might be. I suspect it can't be too difficult because the behavior we expect matches the behavior of #eq?/#match? and friends. If I can hunt it down I'll send a PR/issue into tree-sitter :+1:

the-mikedavis commented 2 years ago

I opened up https://github.com/tree-sitter/tree-sitter/issues/1597 to discuss it. It looks like the locals information is currently only determined and then used for highlights in a single pass over a QueryCursor. For the query we want, this isn't a problem, but for generally being able to use #is-not? local like #eq? or #match? or even just support multiple captures in a stanza, I think the locals tracking system would need to be more complex[^1], which sounds like is currently out of scope (haha) based on https://github.com/tree-sitter/tree-sitter/issues/918#issuecomment-775700421.

[^1]: The field_access query would work only because @module is captured before @function. Determining what to highlight the (identifier) based on the (label) (i.e. backwards) would mean you couldn't do highlights in a single pass without backtracking.

J3RN commented 2 years ago

@the-mikedavis Thank you for your diligent work on this! :tada: :confetti_ball: Seeing as https://github.com/tree-sitter/tree-sitter/pull/1602 is now merged, this issue is fit to be closed.

Once the latest version of tree-sitter-cli is pushed to NPM (see https://github.com/tree-sitter/tree-sitter/issues/1613), we should be able to merge #10 which adds a test for this functionality.