elixir-lang / elixir

Elixir is a dynamic, functional language for building scalable and maintainable applications
https://elixir-lang.org/
Apache License 2.0
24.31k stars 3.35k forks source link

Invalid formatting of :: op as keyword #8832

Closed gomoripeti closed 5 years ago

gomoripeti commented 5 years ago

Environment

Current behavior

If a keyword list contains a key with the atom consisting of two colons (with the sugar syntax) the formatter emits non-compilable code.

Example in the shell

iex(2)> kwlist = ["::": 2]
[::: 2]
iex(3)> kwlist |> Macro.escape() |> Macro.to_string() |> Code.string_to_quoted()
{:error, {1, "syntax error before: ", "\"2\""}}

The quotes are removed when formatting the key which is ambiguous. (note that quotes are only removed in case of 2 colons, because it is an operator)

iex(19)> kwlist = for i <- 0..4, do: {List.duplicate(?:, i)|>List.to_atom, i}
["": 0, ":": 1, ::: 2, ":::": 3, "::::": 4]

The formatter also behaves the same way ie it removes the quotes.

This is not a practical issue, only breaks the guarantee that any random program that can be compiled should also be compilable after auto-formatting.

Expected behavior

Formatter should not break comilability.

taiansu commented 5 years ago

I spend some time on this issue and realize that output [::: 2] is correct because :: is a binary operator, just like [|>: 3]. However, the problem is that Code.string_to_quote/1 shouldn't return an error while parsing "[::: 3]".

Currently, I'm reading elixir.erl's eval/2 and see if I can get it solved.

fertapric commented 5 years ago

@taiansu I think the error might be related to the tokenizer:

iex(4)> :elixir_tokenizer.tokenize('[::: 2]', 1, [])
{:ok,
 [
   {:"[", {1, 1, nil}},
   {:atom, {1, 2, nil}, :::},
   {:int, {1, 6, 2}, '2'},
   {:"]", {1, 7, nil}}
 ]}
iex(5)> :elixir_tokenizer.tokenize('[test: 2]', 1, [])
{:ok,
 [
   {:"[", {1, 1, nil}},
   {:kw_identifier, {1, 2, nil}, :test},
   {:int, {1, 8, 2}, '2'},
   {:"]", {1, 9, nil}}
 ]}

but I'm not sure if it can be solved there.

josevalim commented 5 years ago

Correct. ::: is inherently ambiguous. The fix is most likely to wrap :: in quotes whenever it is used as an atom or keyword in Code.Identifier. It is the same as .. that is always wrapped in quotes when used as a function call due to the ambiguity with ... --

José Valim www.plataformatec.com.br Skype: jv.ptec Founder and Director of R&D

josevalim commented 5 years ago

Thanks @taiansu and @fertapric for looking into this! Based on your feedback, I decided to remove the ambiguity by always quoting, here is the PR: #8890. We can continue the discussion there!