lexical-lsp / lexical

Lexical is a next-generation elixir language server
874 stars 80 forks source link

Don't run top-level code in `.exs` files #798

Closed zachallaun closed 2 months ago

zachallaun commented 2 months ago

Lexical's as-you-type compilation and error reporting works by parsing the AST after a small period of quiescence and then compiling it with Code.compile_quoted/2. This runs top-level forms, however, so a .exs script that performs side-effects when run, e.g. creating or deleting a file, would have them run repeatedly unless you wrap them in a def inside a module.

The goal, then, becomes reporting on as many errors/warnings as possible without running top-level forms.

The current approach here is to chunk all top-level forms into "safe" and "unsafe", where safe forms are only defmodule/use/import/require/alias. Safe forms stay at the top level, while unsafe forms are wrapped to prevent their execution. Concretely, this script:

# my_script.exs

import Foo

call_1()

defmodule Bar do
  ...
end

call_2()
call_3()

would be transformed to this after parsing, but prior to compilation:

# my_script.exs

import Foo

defmodule :lexical_wrapper_0 do
  def __lexical_wrapper__ do
    call_1()
  end
end

defmodule Bar do
  ...
end

defmodule :lexical_wrapper_2 do
  def __lexical_wrapper__ do
    call_2()
    call_3()
  end
end

Issue 1: Incorrect unused/undefined var warnings

This isn't quite good enough, though. It means that if you, for instance, set some constants in your script at the top, you'd get warnings for any usage below if they're separated by a "safe" form. For instance:

dir = "..."
import Something
call(dir)

# would transform into

defmodule :lexical_wrapper_0 do
  def __lexical_wrapper__ do
    dir = "..."
  end
end

import Something

defmodule :lexical_wrapper_2 do
  def __lexical_wrapper__ do
    call(dir)
  end
end

This would result in a warning where dir is assigned (unused variable) and an error where dir is used (undefined variable).

Potential solution: collect vars and references in each chunk of top-level code, replacing unused vars with _var. Then, for any later chunks, use those vars to build "stub parameters" in the wrapper def to avoid undefined variable errors.

defmodule :lexical_wrapper_0 do
  def __lexical_wrapper__([]) do
    _dir = "..."
  end
end

import Something

defmodule :lexical_wrapper_2 do
  def __lexical_wrapper__([dir]) do
    call(dir)
  end
end

Issue 2: Using top-level var as a module name

The approach above will break this code:

module_name = My.Module
defmodule module_name do
  :ok
end

I'm personally okay with this as a "known defect" that perhaps could be tackled later on, though I'm not sure a perfect solution exists.

Things this explicitly doesn't address

zachallaun commented 2 months ago

Few TODOs on this:

Blond11516 commented 2 months ago

Is use really safe? It's probably uncommon but the underlying __using__ macro could very well have side effects.

zachallaun commented 2 months ago

Is use really safe? It's probably uncommon but the underlying __using__ macro could very well have side effects.

I considered that. I think it's extremely uncommon for a top-level use to have anything resembling dangerous side effects. Much more common, and the case I'd like to support if possible, is "bag of aliases/imports" where use Something is a sort of short-hand for importing Something.Foo, Something.Bar, aliasing Something.Baz, etc.

In the same vein, defmodule isn't really safe, since top-level forms inside it are also evaluated. So if you put silly stuff in the top level of your modules, silly stuff will happen when you compile it.

zachallaun commented 2 months ago

I went ahead and implemented the approach I suggested in the PR to fix incorrect unused/undefined variable warnings. Unfortunately, it makes the code rather more complicated, but I've added some unit tests for the stickier bits that will hopefully also help the next person who works on this understand what's going on.

Still very open to other approaches, but I couldn't think of any.

scohen commented 2 months ago

Re: issue 2, I think the use case that won't work is exceedingly rare, and I'm fine not supporting it.

zachallaun commented 2 months ago

Re: issue 2, I think the use case that won't work is exceedingly rare, and I'm fine not supporting it.

@scohen 👍 I'll cross that off the list.

The only thing to decide on now is whether to exclude tests.

My opinion is that we shouldn't. I think the best argument for excluding them from the ".exs treatment" is that the treatment might be buggy, but:

  1. It seems okay from my usage
  2. Since tests are by far the most common .exs file, the code will be exercised a lot and bugs are likely to be noticed quickly
  3. If there ends up being something fundamentally worse about it, we can easily exclude tests later
scohen commented 2 months ago

Agree, let's give it a shot. This makes working with scripts possible. It's so funny that I was literally working on benchmarks and becoming frustrated with lexical being "slow", when it was just running the benchmark.

scohen commented 2 months ago

@zachallaun is this ready to merge?