elixir-lang / elixir

Code.Fragment.container_cursor_to_quoted does not handle stab well #13826

Open lukaszsamson opened 2 months ago

lukaszsamson commented 2 months ago

Elixir and Erlang/OTP versions

Erlang/OTP 26 [erts-14.2.5.2] [source] [64-bit] [smp:12:12] [ds:12:12:10] [async-threads:1] [jit]

Elixir 1.17.2 (compiled with Erlang/OTP 26)

Operating system

any

Current behavior

In cases where relevant information comes after the operator, the returned AST is suboptimal. An example (with the cursor marked as |):

case foo do
  %{some: some} -> do_sth(some)
  %{other: other} = |foo -> {:error, other}
end

AST returned

{:case, [line: 1],
  [
    {:foo, [line: 1], nil},
    [
      do: [
        {:->, [line: 1],
         [
           [{:%{}, [line: 1], [some: {:some, [line: 1], nil}]}],
           {:__block__, [],
            [
              {:do_sth, [line: 1], [{:some, [line: 1], nil}]},
              {:=, [line: 1],
               [
                 {:%{}, [line: 1], [other: {:other, [line: 1], nil}]},
                 {:__cursor__, [line: 1], []}
               ]}
            ]}
         ]}
      ]
    ]
  ]}

as if the code was

case foo do
  %{some: some} ->
    do_sth(some)
    %{other: other} = |foo
end

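This means the cursor ends up in the wrong case branch and in the wrong context: the match is treated as an expression in the body of the first clause rather than as the pattern of a second clause.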

I understand that this behaviour is a consequence of the tokenizer dropping all code after the cursor, which makes the result ambiguous in this case.
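A minimal way to reproduce this, assuming the caller passes only the text before the cursor (which is how `Code.Fragment.container_cursor_to_quoted/2` is normally used). The exact AST and metadata may differ between Elixir versions, so the snippet only prints the result instead of asserting a shape:

```elixir
# Everything up to the cursor. The code after it (`foo -> {:error, other}`
# and the closing `end`) is never seen by the function.
prefix = "case foo do\n  %{some: some} -> do_sth(some)\n  %{other: other} = "

{:ok, ast} = Code.Fragment.container_cursor_to_quoted(prefix)

# Print the AST back as code to see where `__cursor__()` ended up.
IO.puts(Macro.to_string(ast))
```

On the version reported above, the printed code should correspond to the "as if the code was" form, with the match placed in the body of the first clause.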

Expected behavior

Some ideas.

  1. Use indentation to decide which AST representation is more likely
  2. Provide a new API that does not drop valid code after the cursor

sabiwara commented 1 week ago

Summarizing the approach discussed with @josevalim (sorry, rough notes):

We will go with option 2 (provide a new API that does not drop valid code after the cursor), with an API taking an extra cursor position.

Steps:

  1. add container_cursor_to_quoted/3, which splits the input at line+column
  2. call the tokenizer on the rest of the input with check_terminators: false, and match do/end and fn/end pairs until there is a -> or a dangling end
  3. optimize step 2
  4. change the tokenizer to return a "prefix" in case cursor_completion: true and we stop in the middle of a string/sigil/heredoc, and append the prefix when doing step 2/3
  5. Remove this heuristic from container_cursor_to_quoted/2
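
Steps 1 and 2 can be sketched roughly as below. This is only an illustration under stated assumptions: the `CursorSketch` module, both function names and the `{:stab, _}` / `{:dangling_end, _}` / `:none` return shapes are hypothetical, and a regex scan stands in for the real tokenizer, so strings, sigils, heredocs and comments are not handled (that is what step 4 is about). Line and column are assumed to be 1-based, with columns counted in graphemes.

```elixir
defmodule CursorSketch do
  # Hypothetical helpers for illustration only; not part of Elixir.

  # Step 1: split the source at a 1-based line and column.
  def split_at(source, line, column) when line >= 1 and column >= 1 do
    {before_lines, rest} = source |> String.split("\n") |> Enum.split(line - 1)

    case rest do
      [] ->
        # The position is past the last line: everything is prefix.
        {source, ""}

      [current | after_lines] ->
        {left, right} = String.split_at(current, column - 1)
        {Enum.join(before_lines ++ [left], "\n"), Enum.join([right | after_lines], "\n")}
    end
  end

  # Step 2 (toy version): walk identifier-like tokens and `->` in the rest of
  # the input, tracking do/end and fn/end nesting. Stop at the first `->` or
  # `end` seen at depth 0.
  def trailing_fragment(rest) do
    # `do:` and `:end` are matched with their colon, so they never compare
    # equal to the plain keywords below.
    token = ~r/->|(?<![A-Za-z0-9_]):?[a-z_][A-Za-z0-9_]*[?!]?:?/

    result =
      token
      |> Regex.scan(rest, return: :index)
      |> Enum.reduce_while(0, fn [{start, len}], depth ->
        case {binary_part(rest, start, len), depth} do
          {"->", 0} -> {:halt, {:stab, binary_part(rest, 0, start + len)}}
          {"end", 0} -> {:halt, {:dangling_end, binary_part(rest, 0, start)}}
          {opener, _} when opener in ["do", "fn"] -> {:cont, depth + 1}
          {"end", _} -> {:cont, depth - 1}
          _other -> {:cont, depth}
        end
      end)

    # An integer means the scan finished without finding either marker.
    if is_integer(result), do: :none, else: result
  end
end

source =
  "case foo do\n  %{some: some} -> do_sth(some)\n  %{other: other} = foo -> {:error, other}\nend"

# The cursor sits right before `foo` on line 3, which is column 21.
{prefix, suffix} = CursorSketch.split_at(source, 3, 21)
IO.inspect(CursorSketch.trailing_fragment(suffix))
```

For the example from the report, the scan stops at the `->` that follows the cursor, which is the information needed to tell that the cursor is in the pattern of a new clause rather than in the body of the previous one.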