ndmitchell / shake

Shake build system
http://shakebuild.com
Other
771 stars 118 forks source link

idea for lint mode to detect files that were used but not "needed" #660

Open bitc opened 5 years ago

bitc commented 5 years ago

Hi, here's an idea that me and a friend had.

The goal is to check all the build rules inside your ShakeFile.hs to see if you forgot a necessary need (meaning that one of the commands that was executed read a certain file, but the build rule never called need on this file)

The initial idea was a special lint mode that would run like this: before running each build rule, set the file permissions of all of the files to 000. Then, during the execution of the build rule, when need is called, the file is "unlocked" and its permissions are restored. This way, if a command tries to read a file that wasn't declared as a dependency then it will fail with a "permission denied" error when it tries to open the file. If this happens then we know that our build rule code has a bug: we forgot a necessary need (which we can now add).

This lint mode would need to run in single-threaded mode, so that parallel build rules don't interfere with each other.

But this idea has an issue with (correct) build rules that call need on files only after they have already been read (such as gcc -M). To deal with this, I believe the solution is for this lint mode to work like this:

  1. Take a snapshot of all of the files in the project (cp -r . /tmp/snapshot)
  2. Run a regular build (might be allowed to be parallel). All of the dependencies of all the rules are tracked, and stored in the shake database.
  3. Restore the snapshot (rm -r .; cp -r /tmp/snapshot/* .). Make sure though that the shake database is maintained.
  4. Now run the build again, in single threaded mode. Before each build rule is executed, set the permissions of all the files to 000, and then restore the permissions of all the files that were declared as a dependency of this build rule during step 2. (It might be a bit more involved than this, to deal with rule suspend + resume, but it should be workable). If any of the commands try to read a file that wasn't declared as a dependency during the previous run, then we will get a "permission denied" error and now we know that we found a bug in our build script.

Any thoughts?

ndmitchell commented 5 years ago

Interesting idea - certainly could work. The limitations are that you are restricted to single threading (so it's unlikely people would run this on by default). However, other than that, it could be quite effective. You also have the issue that if you run in certain orders, e.g. run rule foo (which declares a dependency on bar), then baz (which doesn't declare bar but uses it) you wouldn't detect all the potential problems.

However, maybe these features are close enough that this permissions approach is not required:

Together the first traces file access, and thus approximates telling you want was need'd, the second is approximately copying snapshots around, and so detects a lot of issues that snapshots catch. Have you tried either of those? They are both pretty new.