post-tangle and post-stitch actions

MyriaCore commented 4 years ago

At the moment, entangled only syncs between the literate markdown source (or documentation source, if you will) and the program source. If the user wants some kind of compilation to happen, they're forced to use a Makefile or something similar. However, if the user wants bi-directional syncing, the makefile needs to use inotify (as the bootstrap/cookiecutter project does), a technology not everybody knows.

Another problem this poses is that we have a bit of duplication going on - the user has to specify their targets to both the makefile, and the entangled.dhall config. This all gets very heavy very fast.

I suggest that we allow the user to specify a shell script that will be run after each stitch and tangle. I have a few design suggestions:

The environment for each script should be extended with the INPUT_FILES variable, which in the case of stitch, would be a list of markdown files, and in the case of tangle, would be a list of program source files.
The user should be able to either point to the path of a shell script, or write a shell script inline with dhall's multi-line text literals.
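To make the proposal concrete, here is a minimal sketch of how a hook runner could pass the file list to a user script. It is only an illustration in Python, not Entangled code: the function name `run_hook` and the choice of a newline-separated `INPUT_FILES` value are assumptions, since the list format is not settled above.

```python
import os
import subprocess
from typing import Sequence


def run_hook(script: str, input_files: Sequence[str]) -> subprocess.CompletedProcess:
    """Run a post-tangle or post-stitch shell script (hypothetical helper).

    The script sees the current environment extended with INPUT_FILES.
    Assumption: the list is encoded as one path per line.
    """
    env = dict(os.environ)
    env["INPUT_FILES"] = "\n".join(input_files)
    # shell=True so that both an inline script and a path to a script work.
    return subprocess.run(
        script, shell=True, env=env, capture_output=True, text=True, check=False
    )


if __name__ == "__main__":
    # After a tangle, the inputs would be the generated program sources.
    result = run_hook('printf "%s" "$INPUT_FILES"', ["src/main.c", "src/util.c"])
    print(result.stdout)
```

A newline-separated value breaks on file names that contain newlines; a real implementation might prefer passing the files as script arguments instead.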

This would allow simple automation tasks to be easily wrapped into entangled's bi-directional pipeline. Newer users would find themselves reaching for makefiles less, which means less confusion and a better first-impression. In addition, use of the daemon would be more viable for "projectless" markdown files (see #66).

jhidding commented 4 years ago

I agree that the current situation with inotify is not ideal, especially since that only works on Linux.

I've been discussing a similar issue with @merijn; he proposed adding Shake (https://shakebuild.com/) support. Having a literal shell script is both too powerful and not powerful enough. You're shifting the load from makefiles to shell scripting, the latter being worse when it comes to building software and tracking dependencies.

There is another issue that can be solved using Shake, and that is evaluating code block content. We have Python pandoc filters that do this through Jupyter, but nothing is cached and this approach may not be generic enough.

We would be looking to combine Dhall configuration with Shake to specify a range of targets that should be built post-tangle (compile, evaluate) and post-stitch (pandoc).

MyriaCore commented 4 years ago

Would adding Shake support necessitate having stack as a runtime dependency? I worry that we'll be leaving the non-Haskell users in the dust. Shake looks cool though.

jhidding commented 4 years ago

Using it as-is would add ghc as run-time dependency, which is not at all what I want. But we could create a minimal build-system on top of shake that does the things we need. This could even replace the entangled daemon entirely, since all of the daemon activities are also accessible from the command-line. I'm setting up a little experiment in entangled/milkshake to get insight into what is possible/feasible. The idea is to have a simple build system, configured with Dhall, that triggers on file system events. When that works, we can see how it integrates with Entangled concepts, e.g. self-computing scientific papers. The idea is definitely not to replace cmake.

merijn commented 4 years ago

It depends on whether you plan to support "arbitrary Shake code" to handle dependencies, which is more general than I was originally thinking. I meant simply using Shake as a library to implement the dependency tracking, not allowing users to write arbitrary Shake code.

In that case it'd have no extra dependencies at all (besides "the shake library and a slightly fatter entangled binary"). Requiring GHC as a runtime dependency seems rather heavy, but it could perhaps be optional for users that want more complicated setups.
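As a rough illustration of what "dependency tracking" means here, the sketch below shows the simplest make-style staleness check, written in Python. It is not how Shake is implemented and not Entangled code; `needs_rebuild` is a hypothetical name, and comparing modification times is just the most basic strategy one could start from.

```python
import os
from typing import Iterable


def needs_rebuild(target: str, dependencies: Iterable[str]) -> bool:
    """Return True if the target is missing or older than any dependency.

    Hypothetical helper: a build library would typically keep its own
    record of what changed instead of comparing timestamps on each run.
    """
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)


if __name__ == "__main__":
    # Example: only recompile when the tangled source is newer than the object file.
    if os.path.exists("main.c") and needs_rebuild("main.o", ["main.c"]):
        print("main.o is out of date")
```

With a check like this built into the tool, the user only has to declare targets and their inputs, without writing any build code.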

MyriaCore commented 4 years ago

I agree with @merijn here. I think that using shake to handle dependencies is a good idea, but I think it'd be best if all we expose to the user is:

Post-tangle action, with some representation of the tangled source code
Post-stitch action, with some representation of the stitched markdown
An option to enable/disable post-tangle actions
An option to enable/disable post-stitch actions

From this, I still have 2 big design questions:

How do we represent actions?
How do we represent the tangled source code & stitched markdown

MyriaCore commented 4 years ago

WRT question 1, I do kinda think that a shell script is still our best bet.

The reason that shell scripts are worse than makefiles when it comes to building software, as @jhidding mentioned, is the lack of dependency-tracking support. So, while I do agree in principle, once we have a dependency-tracking paradigm in place with shake, the only thing that's left is to specify the command or commands to build the software, which should be done in a shell.

MyriaCore commented 4 years ago

One caveat I will add is that we could easily provide a declarative way of matching source files to actions. Like, maybe we match on the filename or the content in some way. My point is that at the end of the day, you're gonna need to write a command to compile anyway.
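A declarative mapping from file names to actions could look roughly like the Python sketch below. Everything in it is hypothetical: the `RULES` table, the `{file}` and `{stem}` placeholders, and the example commands are made up for illustration and are not an existing Entangled or Dhall schema.

```python
import os
import shlex
from fnmatch import fnmatch
from typing import List, Sequence

# Hypothetical rule table: glob pattern -> shell command template.
RULES = [
    ("*.c", "cc -c {file}"),
    ("*.md", "pandoc {file} -o {stem}.html"),
]


def commands_for(files: Sequence[str]) -> List[str]:
    """Pick the first matching rule for each file and fill in its template."""
    commands = []
    for path in files:
        for pattern, template in RULES:
            if fnmatch(path, pattern):
                stem = os.path.splitext(path)[0]
                commands.append(
                    template.format(file=shlex.quote(path), stem=shlex.quote(stem))
                )
                break  # first match wins; files without a rule are skipped
    return commands


if __name__ == "__main__":
    for command in commands_for(["src/main.c", "README.md", "notes.txt"]):
        print(command)
```

The matching is declarative, but each rule still ends in a plain shell command, which is the point made above.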

jhidding commented 4 years ago

Yes, you're right. I have been working on milkshake lately. This setup goes a little beyond what you're proposing. All the things Entangled does can already be decomposed into single shell commands (see manual). Post-tangle actions will fit naturally into this system, and by defining custom actions it will also allow running code blocks outside the context of Jupyter (which is not very reliable for languages other than Python; e.g. Bash support is currently broken). Most of these ideas are already documented in Milkshake.

entangled / entangled

post-tangle and post-stitch actions #67