Closed gausby closed 3 years ago
Ouch :)
This should be no problem, if the syntax requires do:
we can decide on the :
if there is an associated end or not.
I think ignoring do:
's on pairs would suffice :)
There are also some problems with matching defp
's. Could it be because of the order in this list: https://github.com/Fuco1/smartparens/blob/master/smartparens-elixir.el#L39-L52 ?—and should I open another issue?
Also…as I read the code: it will look for a do
, go to the start of the line and look for one of the keywords.
Unfortunately, if this is the case: that assumption does not suffice: The do
doesn't have to be on the same line as the keyword. An example could be this function defintion where I use guards:
defp drop_tailing_bits(%__MODULE__{size: size} = bitfield)
when rem(size, 8) == 0 do
{:ok, bitfield}
end
Am I correct in my understanding of the code, and should I open another issue for this one?
Yep, it works as you say. I guess we don't need another issue, maybe you could open a PR with added failing tests displaying these cases so I can then fix the issues. What do you think?
I will look into it asap, but I am kinda busy until Sunday.
Thanks for a great package, btw :)
@Fuco1 I'm fairly new to emacs and elisp, so I'm sorry if the question sounds dumb, but... wouldn't something like this solve the issue?
here: https://github.com/Fuco1/smartparens/blob/master/smartparens-elixir.el#L41 like this:
...
(when (equal (rx "do[^:]") id)
...
Are there any issues using a regex in this place?
I don't remember what the smartparens code does there, but we need to make sure this do:
construct doesn't expect a corresponding end
. Which the regular do
does expect.
It might still work though. If you use elixir you can test it and if it works we can create a patch.
Hi! This error still pops up in latest version of elixir-mode. Or at least, that's what I am experimenting. Is the issue supposed to be fixed already? (and thanks a lot for your work! not sure I know enough about Emacs Lisp to help but still...)
Hi! Here is another example where similar errors occur in Elixir: with the quote do
construct.
If someone is ready to provide some initial guidance, I'd be happy to help.
The quote do
should already be handled by sp-elixir-def-p
, so if that doesn't work there's likely a bug in that code.
As for the inline do:
, can you provide some minimal valid elixir code where this causes a problem? I do not use elixir or ruby very much.
@Fuco1 Here is a minimal example with do:
defmodule ElixirExampleOne do
def func_one do
IO.puts("hi 1")
end
def func_two, do: IO.puts("hi 2")
end
The def
for func_two
ends up paired with the end
on the last line which is incorrect.
Another related bit of syntax that is interpreted incorrectly is anonymous functions, in this example func_two
is a function that returns an anonymous function. The end
inside func_two
should be paired with fn
, but instead it is paired with the def
of func_two
:
defmodule ElixirExampleTwo do
def func_one do
IO.puts("hi 1")
end
def func_two do
fn name -> IO.puts("hello " <> name) end
end
end
The fn ... end
could be solved separately or in a different PR, but I mention it since it is another incorrect pairing and it might be easier to solve both of them at the same time rather than one after another (I'm not familiar enough with the code to really know).
You can streamline the examples even further by removing func_one
from each of them.
@Fuco1 I'm interested in working on this, as I experience this problem and it makes it a lot harder to navigate code. Any points towards where I should look at? E.g. Docs or maybe there are already languages supported with similar feature?
@axelson, sorry to disturb you, but it seems that I've fixed the do:
issue, and fn ... end
is already handled by Smartparens. However I would like to ask if often see linebreaks before do:
or even do
forms in the code, e.g.:
defmodule ElixirExampleOne do
def func_one
do
IO.puts("hi 1")
end
def func_two,
do: IO.puts("hi 2")
end
Because currently Smartparens is completely broken for this particular case and my PR does not yet fix this. The question is if this needs to be fixed, although I suppose this will be a llittle more performance hungry.
@andreyorst Thanks for taking a stab at this issue! Those two forms are indeed valid elixir code, although the first one is not seen very often, usually there is a when
clause as a guard before the do
. Here's a couple longer examples:
defmodule ElixirExampleTwo do
def can_destroy_message(nil, _), do: {:error, %{reason: "not_found"}}
def can_destroy_message(%{destroyed_at: destroyed_at}, _)
when not is_nil(destroyed_at),
do: {:error, %{reason: "not_found"}}
def can_destroy_message(%{room_id: message_room_id}, %{room_id: member_room_id})
when message_room_id != member_room_id,
do: {:error, %{reason: "unauthorized"}}
def can_destroy_message(%{content: content} = message, room_member)
when not is_nil(content) and not is_nil(room_member) do
# implementation
end
end
The def
being on a separate line from the do
is rather common in Elixir code, so it would be great if it was well-handled.
Good news, I've managed to make this case work last night:
defmodule ElixirExampleTwo do
def can_destroy_message(nil, _),
do: {:error, %{reason: "not_found"}}
def can_destroy_message(%{destroyed_at: destroyed_at}, _)
when not is_nil(destroyed_at),
do: {:error, %{reason: "not_found"}}
def can_destroy_message(%{room_id: message_room_id}, %{room_id: member_room_id})
when message_room_id != member_room_id,
do: {:error, %{reason: "unauthorized"}}
def can_destroy_message(%{content: content} = message, room_member)
when not is_nil(content) and not is_nil(room_member) do
# implementation
end
end
However your example doesn't quite work. The difference is in the newline before first do:
. I'm not sure if I can fix this.
@axelson if you can test implementation from #1058 on some more or less big codebases (I don't write in Elixir and don't know big projects except maybe the language server), and provide some more examples where it breaks it would be very helpful :)
Yay, I've fixed that bug with do:
not being parsed when it is on the same line! Now all code I've seen so far is parsed correctly. Relevant changes are in #1058
Great! I'm having some trouble getting the git version of smartparens installed locally but I should be able to figure it out eventually.
Great! I'm having some trouble getting the git version of smartparens installed locally but I should be able to figure it out eventually.
If you use straight
and use-package
it's easy:
(use-package smartparens
:straight (:host github :repo "andreyorst/smartparens" :branch "elixir-better-search"))
If not, I guess you can simply replace smartparens-elixir.el with code from my repo https://raw.githubusercontent.com/andreyorst/smartparens/elixir-better-search/smartparens-elixir.el and bytecompile it
I guess that the only quirk left out now is bodyless clauses. I'll try to handle it as well.
Also, I've noticed that without (modify-syntax-entry ?_ "w")
Smartparens will think that end_
is also a valid pair. @Fuco1 is this intentional?
Uh oh. I've also noticed that this is a valid syntax:
defmodule Example do
def foo(do: body) do
body
end
end
Example.foo do
IO.puts("1")
IO.puts("2")
end
Which currently breaks all heuristics I've introduced...
And given that this all can be minified to singlle line like this
defmodule Example do def foo(do: body) do body end end; Example.foo do IO.puts("1"); IO.puts("2") end
Making this extremely hard to find matching keywords with my approach. I guess that language with such syntax should use a proper parser instead, something like tree-sitter, to make structural navigation really reliable.
Edit: OK, I might be able to fix this, but I'm not sure if other things will not pop out
I was thinking of using something like LSP for parsing but it might be too slow for interactive navigation. Maybe fetching entire file's AST or some partial AST around point would work, but it's probably a lot of work.
Once you include some test cases for the new work I'll be happy to merge this. Alternatively, if either of you wants to maintain the elixir extension, I'll give you push access.
I was thinking of using something like LSP for parsing but it might be too slow for interactive navigation.
@Fuco1 have you heard about https://tree-sitter.github.io/tree-sitter/? It is amazingly fast and provides all necessary info for structural navigation and manipulation. Albeit it is an external library that must be loaded separately, but there are works in Emacs for this https://github.com/ubolonton/emacs-tree-sitter.
Once you include some test cases for the new work I'll be happy to merge this. Alternatively, if either of you wants to maintain the elixir extension, I'll give you push access.
I don't think I'll go as far as Elixir extension maintainer, as I don't program in it :) At least rightnow. I've just decided to help by fixing the issue, and maybe some others that I have not yet looked at.
I am a maintainer of the Elixir Language Server, so I'm curious how we'd potentially leverage the language server from smartparens. Are there any examples from other languages? Also I'm not currently well-versed enough in elisp to feel comfortable writing elisp for a use-case like this.
@andreyorst with the ElixirExampleTwo
above and your code (546274a) I'm not seeing the first do
for the overall module matching with the end
on the very last line. Hmm, actually it looks like if I hover over the defmodule
then the final end
is highlighted, but if I hover over the final end
then the defmodule
is not highlighted.
Also, I'm very curious about TreeSitter and would like to investigate that more in the future, although last time I checked it didn't seem quite ready to use, but perhaps that's changed by now.
if I hover over the final end then the defmodule is not highlighted.
Here's a really good talk on Tree Sitter: https://www.youtube.com/watch?v=Jes3bD6P0To, and at 5:03 Tree Sitter's author explains why LSP should not be used for such features as syntax highlighting and structural navigation. In short - embedding several language servers into smartparens will almost turn smartparens into another LSP client, so perhaps, if a server provides such features as strucural navigation (I don't know any server that does that), it is better to make smartparens work with LSP-mode or eglot instead.
although last time I checked it didn't seem quite ready to use, but perhaps that's changed by now.
If you mean in Emacs, than maybe yeah. In Atom it is fairly useful since around 2018. Although support for new languages comes at a quite slow pace, but I guess that's because VSCode pushes LSP for highlighting syntax.
if I hover over the final end then the defmodule is not highlighted.
Hm, that's weird. Here's how it works for me
Here's a really good talk on Tree Sitter: https://www.youtube.com/watch?v=Jes3bD6P0To, and at 5:03 Tree Sitter's author explains why LSP should not be used for such features as syntax highlighting and structural navigation. In short - embedding several language servers into smartparens will almost turn smartparens into another LSP client, so perhaps, if a server provides such features as strucural navigation (I don't know any server that does that), it is better to make smartparens work with LSP-mode or eglot instead.
The video you mentioned explains that tree-sitter is more adapted for Atom and GitHub because they deal with plenty of languages. As of smartparens, it's the same thing. But the question is: do all editors have an equivalent of smartparens? Besides, isn't it the point of an LSP to decouple IDE-capibilities from specific frontend? Finally, is providing excellent syntax highlighting without any mistakes for all the possible grammars really feasible without some errors here and there?
LSP specification now supports semantic tokens: https://microsoft.github.io/language-server-protocol/specifications/specification-3-16/#textDocument_semanticTokens
The LSP Server maintainers can implement it using tree-sitter or any other parser they prefer, and support syntax highlighting from there.
However - I do agree that syntax highlighting is a little special, because it's something you always want to work, and language server means I/O, which means errors. Nothing prevent lsp clients to keep in cache the latest syntax information though. Besides, it's better if syntax highlighting work similarly between multiple languages: it's confusing to have different theme for multiple languages so I am a little conflicted here. This is easier to deal with if you have one client-side project - like smartparens - to deal with syntax highlighting for all the languages.
At the end of the day, tree-sitter is just a parser, and LSP is just a protocol. Whether you use tree-sitter from LSP or directly on the client side is a matter of choice, but they are in no way incompatible.
if I hover over the final end then the defmodule is not highlighted.
Hm, that's weird. Here's how it works for me
Probably something specific to my setup (and likely related to something that Doom is adding). I haven't noticed any other issues so :+1: from me on merging the PR.
But the question is: do all editors have an equivalent of smartparens?
@nicobao Didn't seen that coming, but yeah, actually a lot of text editors have smartparens-like plugins. Don't really understand why this is relevant though.
is providing excellent syntax highlighting without any mistakes for all the possible grammars really feasible without some errors here and there?
Tree sitter was implemented with this in mind - to be able to parse correctly as many languages as possible, with a lot of ambiguity and a way to recover from error state while still providing useful syntax tree. This is mostly what Smartparens does, except it is much simpler than tree-sitter as it features small regexp based parser for each language it supports. And as we see in this issue, some languages may require a more complete parsing to work reliably.
Also, perhaps this was an analogy, but just to make sure - Smartparens does not provide syntax highlighting at all. Using syntax highlighting info from a language server just for querying the syntax tree seems overkill to me. Treesitter on the other hand provides just the syntax tree that can be analysed however you want to.
Anyway, I think this gets away from the main topic.
@axelson, @Fuco1 I was able to implement searchers that work for minified version of the code, where everything is in single line, but this actually makes parsing very slow. For files that contain about 100 files the delay between highlighing and jumping at the top level form is significant. I guess we can keep the line based search, that can't detect everything but still a bit better than what we have ATM?
@axelson, @Fuco1 I was able to implement searchers that work for minified version of the code, where everything is in single line, but this actually makes parsing very slow. For files that contain about 100 files the delay between highlighing and jumping at the top level form is significant. I guess we can keep the line based search, that can't detect everything but still a bit better than what we have ATM?
:+1: from me on that approach. I don't think that keeping everything in a single line is very common in elixir (and most people use the formatter which wraps at 100 characters).
@Fuco1 I've added tests, and worked out most edge cases I could reliably do. I consider this as the point of as far as I can go right now.
The Elixir syntax is a bit unforgiving because there are a couple of shorthands which currently confuses smartparens 'show pairs' functionality, such as:
Do-blocks can be written as
do (expressions) end
and, do: expression
.I don't know if there are a way to support this in the show-parens mode.