abathur / binlore

MIT License

binlore - generate and aggregate info about executables in nix packages

Since binlore is very young and currently has a limited scope, the vision may make a little more sense if I outline what it currently does and why. If you'd like to help improve binlore, see How to help.

Why / Motive

I'm building binlore to help resholve decide how likely the executables it finds in shell scripts are to also execute one of their arguments.

This information helps resholve scrutinize these invocations more carefully and require user triage as needed (without wasting user time on unlikely cases).

What / API (high level)

binlore itself is a Nix API with three main functions:

Trying out the API

If you want to contribute to binlore or use binlore in your own project, then you’ll probably want to see what lore binlore produces for particular commands. Here’s how you would get the lore for hello and haskellPackages.hello:

$ git clone https://github.com/abathur/binlore.git
$ cd binlore
$ nix-build -E 'with import <nixpkgs> { }; (callPackage ./binlore.nix { binloreSrc = ./.; }).collect { drvs = [ hello haskellPackages.hello ]; }'
/nix/store/...-more-binlore
$ cat result/execers
cannot:/nix/store/...-hello-2.12.1/bin/hello
cannot:/nix/store/...-hello-1.0.0.2/bin/hello
$ cat result/wrappers
$

In this example, result/wrappers is empty.
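
If you want to consume this lore from a script, here is a minimal sketch of splitting execers lines into their fields. It assumes the two-field verdict:path layout seen in the output above; the sample lines are copied from that output (store hashes left elided), and the function name parse_execers is hypothetical, not part of binlore:

```shell
# Sample lore lines copied from the example output above
# (store hashes are elided there and are left elided here).
lore='cannot:/nix/store/...-hello-2.12.1/bin/hello
cannot:/nix/store/...-hello-1.0.0.2/bin/hello'

# parse_execers is a hypothetical helper, not part of binlore.
# It reads lore lines on stdin and splits each on the first colon:
# the verdict goes in $verdict, the rest of the line in $path.
parse_execers() {
  while IFS=: read -r verdict path; do
    # Skip blank lines.
    [ -n "$verdict" ] || continue
    printf '%s\t%s\n' "$verdict" "$path"
  done
}

printf '%s\n' "$lore" | parse_execers
```

Because path is the last variable given to read, any further colons in a line stay inside the path instead of being split off. Against a real build you would feed it result/execers instead of the sample string.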

How / Analyses

The only analysis so far meets resholve's immediate needs. You can find its definition in the loreDev attr in default.nix, but the broad strokes are that it:

Trying out the analyses (low-level)

Sometimes it's enough to see the high-level lore produced by the collect function for a specific package or executable, and other times you'll need to pop open the hood to understand how binlore's YARA rules are leading to a specific result. (Perhaps because it's wrong, or perhaps because binlore just doesn't have rules for a specific language or binary format.)

With the traditional nix commands, you can do something like:

$ git clone https://github.com/abathur/binlore.git
$ cd binlore
$ nix-shell
$ binlore_yara /nix/store/...-diffutils-3.10
...
executable /nix/store/...-diffutils-3.10/bin/cmp
macho_binary /nix/store/...-diffutils-3.10/bin/cmp
binary /nix/store/...-diffutils-3.10/bin/cmp
macho_cannot_exec /nix/store/...-diffutils-3.10/bin/cmp
decidable /nix/store/...-diffutils-3.10/bin/cmp
cannot_exec /nix/store/...-diffutils-3.10/bin/cmp
executable /nix/store/...-diffutils-3.10/bin/diff
macho_binary /nix/store/...-diffutils-3.10/bin/diff
binary /nix/store/...-diffutils-3.10/bin/diff
macho_execve /nix/store/...-diffutils-3.10/bin/diff
execve /nix/store/...-diffutils-3.10/bin/diff
decidable /nix/store/...-diffutils-3.10/bin/diff
can_exec /nix/store/...-diffutils-3.10/bin/diff

Using the experimental CLI you can do something like:

$ nix develop github:abathur/binlore
$ binlore_yara /nix/store/...-diffutils-3.10
...
executable /nix/store/...-diffutils-3.10/bin/cmp
macho_binary /nix/store/...-diffutils-3.10/bin/cmp
binary /nix/store/...-diffutils-3.10/bin/cmp
macho_cannot_exec /nix/store/...-diffutils-3.10/bin/cmp
decidable /nix/store/...-diffutils-3.10/bin/cmp
cannot_exec /nix/store/...-diffutils-3.10/bin/cmp
executable /nix/store/...-diffutils-3.10/bin/diff
macho_binary /nix/store/...-diffutils-3.10/bin/diff
binary /nix/store/...-diffutils-3.10/bin/diff
macho_execve /nix/store/...-diffutils-3.10/bin/diff
execve /nix/store/...-diffutils-3.10/bin/diff
decidable /nix/store/...-diffutils-3.10/bin/diff
can_exec /nix/store/...-diffutils-3.10/bin/diff

Each line here indicates that a YARA rule of the same name (currently in execers.yar) matched for that path.
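
When the output gets long, it can help to ask which paths matched one particular rule. A rough sketch, assuming each output line is a rule name, a space, and then a path without spaces, as in the sample above; paths_matching is a hypothetical helper, not something binlore ships:

```shell
# A few lines copied from the binlore_yara output above
# (store hashes are elided there and are left elided here).
yara_out='executable /nix/store/...-diffutils-3.10/bin/cmp
macho_cannot_exec /nix/store/...-diffutils-3.10/bin/cmp
cannot_exec /nix/store/...-diffutils-3.10/bin/cmp
executable /nix/store/...-diffutils-3.10/bin/diff
macho_execve /nix/store/...-diffutils-3.10/bin/diff
execve /nix/store/...-diffutils-3.10/bin/diff
can_exec /nix/store/...-diffutils-3.10/bin/diff'

# paths_matching is a hypothetical helper, not part of binlore.
# It prints the path from every line whose rule name equals $1 exactly,
# so asking for cannot_exec does not also pick up macho_cannot_exec.
paths_matching() {
  awk -v rule="$1" '$1 == rule { print $2 }'
}

printf '%s\n' "$yara_out" | paths_matching can_exec
```

In a nix-shell you could pipe binlore_yara's real output into the same function instead of the sample string.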

Note: You may also want to fork this repo, add the relevant package to big.nix (if it isn't already there), and push it up to github to run the CI process. The CI job will dump this information (and some additional analysis) for all included packages.

Meta

I'm not sure what binlore's long-term relationship to individual analyses will be (or that the current abstractions are right).

Lore Formats

There are currently two "kinds" of lore. In both cases below, the field separator is equivalent to FIELD_SEPARATOR=$':':

Usage / Examples

For now, at least, resholve's own uses of binlore should serve as a good example of the rough intent and usage: