elixir-lsp / vscode-elixir-ls

Elixir language support and debugger for VS Code, powered by ElixirLS.
https://marketplace.visualstudio.com/items?itemName=JakeBecker.elixir-ls
MIT License
537 stars 106 forks source link

Improve elixir grammar #22

Closed thiagomajesk closed 4 years ago

thiagomajesk commented 4 years ago

Hi! I was wondering if you have plans to improve the current elixir grammar.

I'm asking because I was recently extending some elixir color schemes and noticed that some tokens are missing or don't have a distinction from others. For instance, both key: (parameter) and :key (atom) are defined as source.elixir > constant.other.symbol.elixir which prevents me from highlighting only atoms.

I was looking for solutions and saw that the ruby extension has a little bit more definitions: https://github.com/rubyide/vscode-ruby/blob/master/packages/vscode-ruby/syntaxes/ruby.cson.json (if you search the page for "hashkey" you'll see some good examples).

axelson commented 4 years ago

Aren't both key: and :key refer to the same :key atom? e.g. [key: "hello"] == [{:key, "hello"}].

Are there any other specific deficiencies in the grammar?

thiagomajesk commented 4 years ago

Hi @axelson! Thanks for the quick response. A better example would be comparing def fun(x), do: x and [key: "hello"] == [{:key, "hello"}]. Currently is not possible to highlight only atoms. In the previous example, both do: and key: or :key would be colored the same because they are both a constant.other.symbol.elixir. So I guess having this definition for atoms would be nice to optionally extend a color scheme and create this distinction (even though in your example they are conceptually the same).

Are there any other specific deficiencies in the grammar?

I'm still testing how flexible the current grammar is regarding syntax highlight, so something else might appear along the way. Right of the bat, I can think of:

How do you feel about keeping this issue open to track those improvements? Also, I think that looking at what they've done in the ruby grammar could be useful.

PS.: Do you know if there is an easy way to test the grammar individually in vscode?

Update:

Note: It's possible to test the grammar modifications by simply changing the ~/.vscode/extensions/syntaxes/elixir.json and reloading vscode.

1) For atoms, I've tested changing this configuration's capture names:

https://github.com/elixir-lsp/vscode-elixir-ls/blob/0a5e2a3f3814e52610197f361ee754b914df486e/syntaxes/elixir.json#L2380-L2389

To punctuation.definition.atom.elixir and constant.language.atom.elixir, so it's has a distinction from the captures that highlight keys.

2) For function calls, I've tested changing this configuration's capture names:

https://github.com/elixir-lsp/vscode-elixir-ls/blob/0a5e2a3f3814e52610197f361ee754b914df486e/syntaxes/elixir.json#L157-L171

From entity.name.function.elixir to entity.name.function-call.elixir (like ruby's grammar). Also, we would probably need to add a capture group for function calls with and without parenthesis and probably add this change in other places like the "Erlang Function call without parenthesis" section.

In my tests, this regex is actually capturing functions with parenthesis and only the ones prefixed by a module/ namespace (Module.call(%Task{}) and not call(%Task{})).

thiagomajesk commented 4 years ago

For namespaces and modules we could add something like this:

{
    "captures": {
        "1": {
            "name": "keyword.other.special-method.elixir"
        },
        "2": {
            "name": "entity.name.namespace.elixir"
        },
        "3": {
            "name": "entity.name.type.module.elixir"
        }
    },
    "match": "(?<keyword>(?:defmodule|use|import|require|alias)){1}\\s(?<namespace>(?<module>\\b[A-Z]\\w*\\.?)+)"
}

For visualization purposes, this will be matched like this:

code

Noticed that the part after defmodule is not coloring the namespace. I guess we should probably change the code bellow to capture only the module name for this to work right:

https://github.com/elixir-lsp/vscode-elixir-ls/blob/0a5e2a3f3814e52610197f361ee754b914df486e/syntaxes/elixir.json#L13-L22

@axelson Besides the things I mentioned in the previous comment if you help me validate this I can PR those changes. What do you think?

CaiqueMitsuoka commented 4 years ago

Hey @thiagomajesk nice to see you looking into improve the function names detection on the elixir grammar.

Let me see if I can help you on anything.

First of all I made de PR adding it quite recently PR#15, take some time to look into it.

There are some open conversation yet in the 2 PRs that did to the grammar for sublime which Valim takes care of. I recommend you look into that too. elixir-editors/elixir-tmbundle PR#179 elixir-editors/elixir-tmbundle PR#181

About calling the definition and the actual call the same scope, that was a decision that I took base on the TextMate grammar manual but people are asking for a differentiation and I think its fair enough to give people what they want, I will start working on it, and will take your thoughts in consideration and ask for you review 🙇 .

The case of the function calls without parenthesis is that they are ambiguous with plain variables. Only when a the call has a module that it is explicitly a function call(e.g Module.call) So why support all other types of call if the implicit stays out of support? Because the ideia is that it will be remove and we don't want to encourage the use of it. The compiler throws a warning and the formatter adds the parenthesis by default.

Your examples Module.call(%Task{}) and call(%Task{}) works fine here. But is actually it falls into other rule, that was included together.

{
  "match": "\\b[a-z_]\\w*[!?]?(?=\\s*?\\.?\\s*?\\()",
  "name": "entity.name.function.elixir"
}
axelson commented 4 years ago

Noticed that the part after defmodule is not coloring the namespace. I guess we should probably change the code bellow to capture only the module name for this to work right:

Elixir doesn't actually have namespaces so I probably won't merge any syntax changes that make it appear as if it does.

thiagomajesk commented 4 years ago

Hey, @CaiqueMitsuoka! Thanks for the reply.

I guess I saw the code you mentioned while I was tinkering around - thanks for the explanation, though 😄 .

About the function calls: Since the ambiguity only happens if we don't have parameters that follow (as per the compile warning), I guess we could do something about the calls that have params!? I don't know actually how complex would be to capture this because I'm not very good with regex. But the code below is a good example of what I'm talking about:

code

Right now, both calls to say_hello are treated with the source.elixir scope while just the first one is actually ambiguous to the compiler.

I will start working on it, and will take your thoughts in consideration and ask for you review 🙇

And I'm glad to help whatever I can. BTW, in this next batch, are you perhaps inclined to consider the other points I've made besides the change in function calls? I mean especially the changes in the syntax for atoms and namespaces I've mentioned before.

thiagomajesk commented 4 years ago

Elixir doesn't actually have namespaces so I probably won't merge any syntax changes that make it appear as if it does.

@axelson I don't want to argue about semantics but isn't the naming convention made like this to emulate the same hierarchy and organization that namespaces provide? Furthermore, even though Elixir does not use namespaces as a built-in language construct, this nomenclature is used in the docs, feature announcements, books, and style guidelines.

Based on that, I guess it wouldn't hurt to include a "namespace" scope bellow the meta.module.elixir. This way the current coloring will be the default if you don't specify a config for the namespace scope (it's basically the user's choice to use it or not with a sane behavior by default).

I personally would very much appreciate having at least the option of doing so because I find my code would be much more readable making that distinction.

There are also some color schemes with strong colors for module names, and the only way to prevent an epilepsy-prone syntax highlight is by aliasing everything and not allowing "namespaces" in function calls, for example.

axelson commented 4 years ago

Partially solved in #40. As mentioned at https://github.com/elixir-lsp/vscode-elixir-ls/pull/40#issuecomment-589199897 this issue is rather broad and would be better tackled with individual issues/PR's.

Based on that, I guess it wouldn't hurt to include a "namespace" scope bellow the meta.module.elixir. This way the current coloring will be the default if you don't specify a config for the namespace scope (it's basically the user's choice to use it or not with a sane behavior by default).

I'm open to adding a syntax definition that would assist with this as long as it isn't colored by default in most themes, although I may want a different name like module-prefix

A better example would be comparing def fun(x), do: x and [key: "hello"] == [{:key, "hello"}]. Currently is not possible to highlight only atoms. In the previous example, both do: and key: or :key would be colored the same because they are both a constant.other.symbol.elixir. So I guess having this definition for atoms would be nice to optionally extend a color scheme and create this distinction (even though in your example they are conceptually the same).

do: and :key are both atoms, since def is just a macro and actually uses keyword list syntax under the hood: https://elixir-lang.org/getting-started/case-cond-and-if.html#doend-blocks

Potentially an additional scope could be added for do:, but by default it should be treated as an atom.

thiagomajesk commented 4 years ago

Thanks, guys for everything so far, https://github.com/elixir-lsp/vscode-elixir-ls/issues/42 and https://github.com/elixir-lsp/vscode-elixir-ls/issues/43 were created to track the remaining work as we discussed