A build tool for data projects.

Muck

Muck is a build tool for data analysis projects. Given a target (a file to be built), it looks in the current directory tree for a source file with a matching name, determines its static dependencies, recursively builds those, and then builds the target, possibly building additional dynamic dependencies as necessary. For example, if we ask Muck to build some.txt, it will run the source file whose name has some.txt as a prefix, e.g. some.txt.py, some.txt.sh, or some.txt.md (there must be exactly one source candidate). If some.txt.py opens data.txt, Muck will suspend the execution of the process, update data.txt, and then let the process continue.
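The source lookup described above can be sketched in a few lines of Python. This is only an illustration, not Muck's actual implementation: the function name find_source is hypothetical, it takes a flat list of file names rather than walking a directory tree, and it assumes that "prefix" means the target name followed by a dot and a further extension.

```python
def find_source(target: str, filenames: list[str]) -> str:
    """Return the one file name that extends `target` with a further extension.

    Hypothetical helper for illustration; Muck's real lookup may differ.
    """
    prefix = target + "."
    candidates = [name for name in filenames if name.startswith(prefix)]
    if not candidates:
        raise LookupError(f"no source file found for target {target!r}")
    if len(candidates) > 1:
        # The README requires a single source candidate per target.
        raise LookupError(
            f"multiple source candidates for {target!r}: {sorted(candidates)}"
        )
    return candidates[0]


files = ["some.txt.py", "data.txt.sh", "notes.md"]
print(find_source("some.txt", files))  # some.txt.py
```

If both some.txt.py and some.txt.sh were present, the sketch raises an error instead of picking one, mirroring the single-candidate rule.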

Unlike Make and other traditional build systems, Muck does not use a "makefile". Instead, Muck determines the dependencies of a given file using static analysis and runtime interposition of the Unix open system call. With Muck, programmers can organize projects into discrete steps with arbitrary file dependencies between them. When the source code for a particular step changes, Muck will rebuild that step and all dependent ("downstream") steps, but will not redo any work that is not affected by the change. This incremental rebuild behavior speeds up the development process and helps prevent errors due to stale product files.
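To make the incremental rebuild behavior concrete, here is a minimal sketch of how the set of stale ("downstream") products could be computed once dependencies are known. It is an assumption-laden illustration rather than Muck's code: the function downstream_steps and the hand-written deps mapping are hypothetical, whereas Muck discovers dependencies itself through static analysis and interposition of open.

```python
from collections import deque


def downstream_steps(deps: dict[str, set[str]], changed: str) -> set[str]:
    """Return every product that depends, directly or transitively, on `changed`."""
    # Invert "product -> its dependencies" into "file -> products that use it".
    dependents: dict[str, set[str]] = {}
    for product, sources in deps.items():
        for source in sources:
            dependents.setdefault(source, set()).add(product)

    # Breadth-first walk from the changed file through its dependents.
    stale: set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for product in dependents.get(current, ()):
            if product not in stale:
                stale.add(product)
                queue.append(product)
    return stale


# Hypothetical project: each product maps to the files it reads.
deps = {
    "data.txt": {"data.txt.sh"},
    "some.txt": {"some.txt.py", "data.txt"},
    "other.txt": {"other.txt.py"},
}
print(sorted(downstream_steps(deps, "data.txt.sh")))  # ['data.txt', 'some.txt']
```

Editing data.txt.sh marks data.txt and some.txt as stale, while other.txt is left alone because nothing it depends on changed.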

Muck is most useful for projects where the various products can be given descriptive, discrete names. It is less useful for problems that can be framed as processing a continuous stream of inputs; these are better served by an application server.

Getting Started

Muck is a work in progress. I encourage people to try it out, with the caveat that it is not yet entirely stable. If you run into issues, I am more than happy to help you work through them. The project is hosted at https://github.com/gwk/muck, with documentation at https://gwk.github.io/muck. To get started, read the "Installation" section.

License

All of the source code and documentation are dedicated to the public domain under CC0: https://creativecommons.org/publicdomain/zero/1.0/.

Status

Muck is still in development. Currently it runs only on macOS: Linux support has recently fallen behind, though restoring it is planned. It has been used for a variety of experimental projects, but more work is needed to make it production-ready. In particular, the test suite and documentation need improvement.

Issues

Please file any bugs, questions, or comments at https://github.com/gwk/muck/issues.