doorgan / sourceror

Utilities to manipulate Elixir source code
Apache License 2.0
326 stars 22 forks source link

Proposal: `Zipper.at(ast, position)` #155

Closed zachallaun closed 3 months ago

zachallaun commented 4 months ago

For language servers, a fundamental operation is "get the innermost node at the cursor". Once you're at that node, it is often useful/necessary to navigate around a bit, potentially inspecting siblings, parents, etc. Zippers are optimal for this, but there is a performance overhead to using Zippers when drilling down to the innermost node.

An option I thought of is to introduce a Zipper.at/2 that, given an AST and a position, returns a zipper focused on the innermost node at that position. This can be done by recursively drilling down to the innermost node whose range contains the given position, while keeping track of a stack of {parent, left_siblings, right_siblings}, and then constructing a Zipper in one shot.

zachdaniel commented 4 months ago

The only thing I'd note here is that this would likely be incompatible with a modified zipper. I don't think that's a problem but should perhaps be documented clearly.

zachallaun commented 4 months ago

The only thing I'd note here is that this would likely be incompatible with a modified zipper.

I'm not sure I follow; the idea here is that this would be used as an alternative to Zipper.zip/1, so it's what you'd use to construct the zipper in the first place. Once you have it, you can modify till your heart's content, go to the root, and get the now-modified AST.

zachallaun commented 4 months ago

Maybe I do understand: You can't modify a zipper, get the root node, and then "jump" to another position to get a new zipper because the modified nodes would not have enough information to accurately get ranges. Agreed that it should be documented as a function to construct your zipper from original AST, not modified AST.

zachallaun commented 4 months ago

Sketch of what this could look like (written on my phone, almost certainly contains errors):

@spec at(Macro.t(), {line, column}) :: {:ok, t} | :error when line: pos_integer(), column: pos_integer()
def at(ast, position) do
  with {:ok, path} <- fetch_path_to(ast, position) do
    {:ok, new_from_path(path)}
  end
end

defp new_from_path([]), do: nil

defp new_from_path([{node, left, right} | ancestors]) do
  %Z{
    node: node,
    path: %{
      left: left,
      right: right,
      parent: new_from_path(ancestors)
    }
  }
end

defp fetch_path_to(ast, position) do
  if node_contains?(ast, position) do
    {:ok, path_to(position, [{ast, [], []}])}
  else
    :error
  end
end

defp path_to(position, [{parent, _parent_left, _parent_right} | _] = path) do
  {left, node_and_right} =
    parent
    |> children()
    |> Enum.split_while(fn child ->
      not node_contains?(child, position)
    end)

  case node_and_right do
    [] ->
      path

    [node | right] ->
      path_to(position, [{node, left, right} | path])
  end
end

defp node_contains?(ast, position) do
  case Range.get_range(ast) do
    nil -> false
    range -> exclusive_range_contains?(range, position)
  end
end

defp exclusive_range_contains?(range, pos) do
  range_start = {range.start[:line], range.start[:column]}
  range_end = {range.end[:line], range.end[:column]}

  pos >= range_start and pos < range_end
end
zachallaun commented 4 months ago

Proposal update: for consistency with the rest of Sourceror, the position should be passed in as a [line: line, column: column] kw list instead of a {line, column} tuple.

doorgan commented 4 months ago

This would be a welcome change!

Proposal update: for consistency with the rest of Sourceror, the position should be passed in as a [line: line, column: column] kw list instead of a {line, column} tuple.

This was my first thought, let's do it this way 👍

returns a zipper focused on the innermost node at that position.

This sounds like a good approach, I've thought of adding a function like this but taking a range instead of a position and there's a lot of questions that arise in that case(since you can have ranges that match the tree in disparate ways. Matching on a single position seems easier to implement