elixir-lang / tree-sitter-elixir

Elixir grammar for tree-sitter
https://elixir-lang.org/tree-sitter-elixir
Apache License 2.0
249 stars 25 forks source link

Incorrect parsing/highlighting of member access and function calls #72

Closed azizk closed 2 months ago

azizk commented 2 months ago

Current Behaviour

The grammar currently parses these expressions incorrectly as function calls:

Note that calling a function like this is deprecated in the newest Elixir versions.

Parsed correctly but not highlighted as a function call:

Expected Behaviour

I think the grammar should parse/highlight as follows:

# Parse as member access or at least don't highlight as function calls:
map.member
map.another_map.member
# Always parse and highlight as function calls:
function.()
map.function()
Module.function
# Always highlight as a function call:
something |> function
jonatanklosko commented 2 months ago

The grammar currently parses these expressions incorrectly as function calls:

  • map.member
  • map.another_map.member

This is expected. The parser mirrors Elixir AST, and map.member is parsed as a call, the same way as map.member().

We can however change the highlighting.

map.member
map.another_map.member

Currently another_map and member are highlighted as @function, which is indeed not correct. I believe the accurate highlight tag would be @property, which JavaScript and Rust highlights use for fields.

As for function.() I am not certain that it should be highlighted as function, because it is noticeably different from a regular function call. If we did that, it would be inconsistent with the highlighting in function = fn x -> x end (which we could also highlight as function, but I don't think we should). @josevalim wdyt?

the-mikedavis commented 2 months ago

I agree: syntax-wise the "foo" in foo.() is a variable rather than a function.

Updating the highlights for the right-hand-side of dot makes sense though especially if map.member is deprecated as a way to call a function

azizk commented 2 months ago

This is expected. The parser mirrors Elixir AST, and map.member is parsed as a call, the same way as map.member().

You're right. We can test this in iex with quote do map.member end and quote do map.member() end. The former has [no_parens: true].

We can however change the highlighting.

Great. :+1:

As for function.() I am not certain that it should be highlighted as function, because it is noticeably different from a regular function call. If we did that, it would be inconsistent with the highlighting in function = fn x -> x end (which we could also highlight as function, but I don't think we should). @josevalim wdyt?

I don't think it would be inconsistent because in general calling a function is different from defining it, or at least it's certainly helpful to make that distinction. So function = fn x -> x end should remain a variable declaration syntactically and visually (or maybe a function declaration visually). However, when we see function.() it clearly represents a function call and highlighting it as such would reflect what the code actually does. Personally, I'm confused if this visual cue is missing.

axelson commented 2 months ago

One thing to keep in mind is that you can try to call any variable as a function which will fail. For example: foo = 42; foo.(). In that case I don't think that it makes sense to highlight foo as a function because it isn't one. And while in simple cases like what I just showed we might be able to determine if a variable contains a function in the general case we cannot (as far as I understand). Maybe this could change with the type system, but that's still a ways away.

jonatanklosko commented 2 months ago

@axelson highlighting only works on syntax level, so foo.() (or mod.foo()) has the same meaning, regardless if it fails at runtime (or even compile time) :)

josevalim commented 2 months ago

This is expected. The parser mirrors Elixir AST, and map.member is parsed as a call, the same way as map.member().

For what is worth, we have an issue to change that in Elixir and we have already added a [no_parens: true] metadata. We want to make them have different AST sooner than later. I recommend this parser to trail ahead and emit different nodes for them.

(Elixir v1.15 emitted compile-time warnings, Elixir v1.17 emits runtime warnings based on the metadata).

azizk commented 2 months ago

One thing to keep in mind is that you can try to call any variable as a function which will fail. ...

You may also do that with the pipe operator as in data |> some_variable. That shouldn't prevent us from expecting it to be highlighted as a function call. It's out of the current scope of a tree sitter grammar to be able to know what a symbol really is. I believe it's possible to integrate basic semantic analysis into a syntax highlighter to enhance it, but in case of Elixir it may never be completely accurate due to its dynamic nature. But that's a different project and not a simple feature to build.

josevalim commented 2 months ago
  1. foo() is a local function call, no variables
  2. foo.() has foo as a variable holding a function
  3. foo.bar() has foo as a variable holding a module
  4. foo.bar has foo as a variable holding a map

So, in case of 2, if we can say that foo is a variable holding a function, that would be correct. But foo is not a function. It depends if we are doing static analysis or not (and we are being consistent with this analysis in 3 and 4).

azizk commented 2 months ago

The symbol foo in foo.() is a variable but it's used as a function. The highlighter usually cannot know what a symbol is (without expensive semantic analysis), so the best thing to do is to highlight things as what they most likely appear to be. To me it's not only visually appealing but important to highlight function calls consistently, because when I see the colour I know instantly something is being called. When foo.() isn't in that colour it looks strange and off to me.

Btw, while experimenting I found something curious:

iex> foo = %{bar: 1}
iex> foo.bar()
1

I would have expected an exception here (using Elixir 1.16.3).

In any case, we can see that we're able to put different values in these symbols and still use them in call expressions. The highlighter will never know what they hold at runtime, but it doesn't need to. It should aid us in making code more readable as best as it can.

josevalim commented 2 months ago

Elixir does not need semantic analysis. We can tell syntactically what is what and the code you pasted emits a warning on v1.17.

The root of my argument is in this screenshot:

Screenshot 2024-09-27 at 21 14 05

If you say that foo in the last example should also be colored in orange/green, because we know it is a module (and we can decide this syntatically), then I agree we should color it for function. But I think most people would find that confusing and surprising. So we should treat it as variable in all of the cases. We are either consistent in showing types or we show the syntactic information. I am leaning towards the syntax information (variable).

EDIT: Updated screenshot.

azizk commented 2 months ago

So you mean foo in foo.bar() could be colored pastel green because we know it must be a module? But we don't because it's a variable... I see what you mean. However, by the same reason shouldn't bar be uncolored as well, because it's actually a property of foo and therefore not a "function"? But we don't leave properties uncolored everywhere in order to be consistent. As soon as an identifier is followed by parentheses (or an argument) it's colored as a function call. That seems to be the defining rule and can also be considered to be consistent.

Besides, what to do about something like foo.bar.()? bar is a property and currently it's colored as a function call, but after the coming fix it will be uncolored text. Does that make sense?

Lastly, let's take a look at this example:

doc
|> next_break_fits(:enabled)
|> fun.()
|> next_break_fits(:disabled)

Doesn't it seem strange that fun is uncolored? Doesn't it break the "visual flow"? When we read the code, we see that doc is piped into a function and then into the variable fun, but there is no visual indication that something will be called. How is that visually appealing and helpful?

josevalim commented 2 months ago

The example does not look weird to me at all, either in GitHub or my editor (which also uses tree sitter) . Especially because the purple one tells me the name of the function that will be invoked. fun.() does not tell me the name of the function it will be invoked and I don't want to think there is a function named fun.

So yes, it looks both correct and visually appealing to me, even though I don't think we should go into the discussion of what is visually appealing, because it will be based on taste and that will not lead us anywhere. With the opened PR, it seems we will be consistent here, so I will close this in favor of the PR.

jonatanklosko commented 2 months ago

Besides, what to do about something like foo.bar.()? bar is a property and currently it's colored as a function call, but after the coming fix it will be uncolored text. Does that make sense?

After the PR bar in foo.bar.() will be highlighted as a property, which is consistent with foo in foo.() being highlighted as a variable.

Another perspective on the discussion is that semantically it's not about highlighting "calls", but rather "function" references. And it's clear that foo in foo.() references a variable.

azizk commented 2 months ago

Thanks for the discussion and the grammar changes so far! :pray:

Some good arguments were exchanged and I see why you want to keep foo.() a variable reference. In summary, it's because we don't respectively color foo in foo.bar() as a module and foo in foo.() is simply not a function reference.

We don't need to talk about taste, but I think we should still have a technical discussion (in a separate issue) about making a distinction between function declarations, calls and references. If we did that we could easily customize the appearance of these symbols to our liking in the editor of one's choice. :+1: