Synthetica9 / nix-linter

Linter for the Nix expression language
BSD 3-Clause "New" or "Revised" License
158 stars 16 forks source link

fix UnusedLetBind: account for `inherit` #62

Open Radvendii opened 2 years ago

Radvendii commented 2 years ago

This fixes #61 and additionally checks inherit statements for unused identifiers bound.

It changes the philosophy of checking for unused let bindings a little bit. Now it checks by removing each binding in turn and reconstructing the rest of the let statement, and then checking whether that name is free in the constructed let binding. If not, it wasn't being used.

For instance,

let
  x = 5;
  y = { z = 3;};
  inherit (y) z;
in
  x

Becomes internally

let
  y = {z = 3;};
  inherit (y) z;
in
  x

and now x is free, so we were using that binding. It also becomes

let
  x = 5;
  inherit (y) z;
in
  x

and now y is free, so we needed that binding too. Finally, it becomes

let
  x = 5;
  y = {z = 3;};
in
  x

And we discover that z is not free, so the last binding was not needed. If the inherit binds multiple values, it will check each of them separately.

Synthetica9 commented 2 years ago

How does this affect performance? Is this still able to check all of nixpkgs in a reasonable time?

Synthetica9 commented 2 years ago

Some basic testing suggests this is somewhat cpu intensive, but not extremely so:

Before:

________________________________________________________
Executed in  117.01 secs    fish           external
   usr time  285.60 secs    0.17 millis  285.60 secs
   sys time   55.60 secs    1.04 millis   55.60 secs

After:

Executed in  173.32 secs    fish           external
   usr time  374.18 secs    0.71 millis  374.18 secs
   sys time   87.61 secs    1.13 millis   87.61 secs
Radvendii commented 2 years ago

Ah I see. I have a few ideas for considerably speeding up the execution.

  1. We could copy the code from hnix that calculates free variables, and modify it to ignore anything that gets bound in the first let statement. This would, in one go, give us all the "free" variable, and we can compare that to the set of variables bound in the let statement. Pros:

    • Elegant
    • Cuts down significantly on redundant computation (goes from quadratic to linear as a function of the number of bindings)

    Cons:

    • Requires duplicating a big function from hnix, which could get out of sync and is generally kind of a pain
    1. We could evaluate the body of the let statement (everything after in) only once for free variables, and keep them in a list. Then put a dummy in there afterward (like 5). The let bindings themselves would be evaluated the same way they are right now. Pros:
      • Somewhat faster (maybe significantly faster, depending how big the body of the let statement is)
      • more similar to what we have right now; doesn't require duplicating lots of code from hnix.

    Cons:

    • Kind of inelegant and confusing. It's clearly not the simplest or the fastest approach, so it might not be clear why we're doing it (could be mitigated by a comment)

Up to you which you prefer

Radvendii commented 2 years ago

One thing that would help me is understanding where the function freeVars comes from. It's not defined anywhere in hnix, but there's a function that seems to be doing the same thing called getFreeVars. Was it renamed?

EDIT: I remembered that git blame exists and indeed it's been renamed

Synthetica9 commented 2 years ago

One thing that would help me is understanding where the function freeVars comes from. It's not defined anywhere in hnix, but there's a function that seems to be doing the same thing called getFreeVars. Was it renamed?

EDIT: I remembered that git blame exists and indeed it's been renamed

I think I copied that from hnix back when they didn't export it (around hnix 0.4 or so). It would probably be better to use the upstream version if they export it, and do any bugfixes there instead of copying it.

Radvendii commented 2 years ago

Alright, let me know what you think of this solution. It doesn't copy over all the code, just the stuff that's needed to make the function we want.

Also let me know if it actually speeds anything up. I'm getting wildly inconsistent speeds from one run to the next, so I can't easily compare.