jtojnar / nixpkgs-hammering

Beat your package expressions into a shape
MIT License
253 stars 15 forks source link

Add unused-argument check #32

Closed rmcgibbo closed 3 years ago

rmcgibbo commented 3 years ago

Still some work to be done, but I wanted to file the WIP before signing off for the night.

  1. ~Doesn't handle func = a: 1;-type functions yet, only handles functions where the formal parameters are in a pattern struct (obviously needs to be fixed).~
  2. Not every possible unused argument is actually caught -- a construct {foo}: {foo = 1;} will not show as having an unused argument even though it does, because the keyvalue key foo in the function body will be incorrectly identified as a use of the formal parameter. This might not actually need to be fixed.
  3. Other scoping like with and let blocks that introduce new variables and cause variables in the outer scope to be shadowed aren't taken into account, so that's another source of false negatives.

But re (2,3), I don't think false negatives are that bad here. False positives would be the problem.

rmcgibbo commented 3 years ago

In terms of the infrastructure / abstractions, the refactoring out of common code in #30 works really nicely.

rmcgibbo commented 3 years ago

I ran this on all 3,693 default pkgs/development/python-modules/*/default.nix files in my checkout and it doesn't crash, so that's good. Here's an example of an argument that gets flagged as unused: https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/python-modules/ihatemoney/default.nix#L40.

It's kind of spurious, since the structure of overlays requires that argument.

I'm not sure what the best way to handle that situation is. Should we encourage unused arguments to be prefixed with an underscore?

jtojnar commented 3 years ago

One issue is that nix flake check requires the arguments of overlays to be named final and prev so people will probably want to use that in Nixpkgs as well for easy copying. Similarly the function passed to overrideAttrs requires a single argument that is named attrs by convention.

So in this culture, it makes little sense to me to underscore unused arguments, especially since pattern args are much more frequent than in Haskell. Maybe the check itself should only be targeting the pattern args after all.

rmcgibbo commented 3 years ago

Good point. One idea would be to whitelist unused args if they they're substrings of any of ["final" "prev" "old" "new" "super" "self" "attrs"], but as you said applying only to pattern args might be even better.

jtojnar commented 3 years ago

I agree that we want to err on the side of false negatives. Especially since you might not even be able to determine the scope statically (not sure it is actually true, let or inherit does not allow dynamic attributes and dynamic attributes in recursive attribute sets do not seem to affect scope).

If we wanted full tracking, we would need something like the following pseudo-code:

struct Binding {
    pub name: String,
    pub used: Bool,
    pub loc: Location,
}

impl Binding {
    pub fn new(name: String, loc: Location) -> Binding {
        Binding { name = name, used = false; loc = loc; }
    }
    pub fn set_used(mut &self) {
        self.used = true;
    }
}

struct Scope {
    pub bindings: Map<String, Binding>,
    pub subscopes: Vec<Scope>,
}

impl Scope {
    pub fn new() -> Scope {
        Scope { bindings = HashMap::new(), subscopes = Vec::new(), }
    }

    fn get_last_mut(mut &self) {
        subscopes.last_mut().map_or(self, |s| s.get_last_mut())
    }

    fn get_layers_mut(mut &self) -> _ {
        std::iter::successors(self, |s| s.get_last_mut()).rev()
    }

    pub fn mark_used(mut &self, name: String) {
        for mut &scope in get_layers_mut() {
            if let mut Some(binding) = scope.bindings.entry(name) {
                binding.mark_used();
                return;
            }
        };
    }

    fn collect_bindings() -> Iter<Binding> {
        self.bindings.chain(self.subscopes.map(|s| s.collect_bindings()).flatten())
    }
}

fn build_scope(node: SyntaxNode &node, parent_scope: mut &Scope) {
    match node {
        FunctionNode(arg, body) => {
            let mut scope = Scope::new();
            match arg {
                Ident(ident) => {
                    scope.bindings.insert(ident.name, Binding::new(ident.name, ident.loc));
                }
                Pattern(attrset) => {
                    for key in attrset {
                        scope.bindings.insert(key.ident, Binding::new(key.ident, key.loc));
                    }
                }
                _ => {
                    // ?
                }
            }
            parent_scope.subscopes.push(scope);
            build_scope(body, scope);
        }
        LetBinding() => {
            // similar for let
        }
        Ident(ident) => {
            scopes.mark_used(ident.name);
        }
    }
}

fn main() {
    let mut root_scope = Scope::new();
    let root = find_root(...);
    build_scope(root, root_scope);

    assert!(root_scope.collect_bindings().filter(|b| !b.used).is_empty());
}
jtojnar commented 3 years ago

Could you please also mention the constraint (3) in the code as a comment? Otherwise looks great.