Idea: Compile Single Script

Our conversation in #81 got me thinking: what if we're approaching this all wrong? Rather than writing a lot of logic in awk to get things executing in the correct order, why don't we instead just focus on "compiling" the Makesurefile into an executable script?

For example, say you have this code in your Makesurefile:

readonly A="this is setup code"

@goal default
@reached_if [ -f foobar ]
    echo "${A}" > foobar

It seems like we could pretty easily compile this into the following Bash script:

#!/usr/bin/env bash
readonly A="this is setup code"

function goal:default() {
    function reached() {
        [ -f foobar ]
    }
    function exec_dependencies() {
        true # No dependencies - noop
    }
    function exec_goal_body() {
        echo "${A}" > foobar
    }
    if reached; then
        echo "  goal 'default' [already satisfied]"
    else
        exec_dependencies
        echo "  goal 'default' ..."
        exec_goal_body
    fi
}

"goal:${1:-default}"

Here's what it might look like with a more typical Makesurefile:

readonly A="this is setup code"
readonly B="it should only run once"

@goal foobar @private
@reached_if [ -f foobar ]
    echo "${A}" > foobar

@goal whatever @private
@reached_if [ -f whatever ]
    echo "${B}" > whatever

@goal clean
    rm -f foobar whatever

@goal default
@depends_on foobar whatever
    cat foobar
    cat whatever

This would compile to:

#!/usr/bin/env bash

readonly A="this is setup code"
readonly B="it should only run once"

function goal:foobar() {
    function reached() {
        [ -f foobar ]
    }
    function exec_dependencies() {
        true # No dependencies - noop
    }
    function exec_goal_body() {
        echo "${A}" > foobar
    }
    if reached; then
        echo "  goal 'foobar' [already satisfied]"
    else
        exec_dependencies
        echo "  goal 'foobar' ..."
        exec_goal_body
    fi
}

function goal:whatever() {
    function reached() {
        [ -f whatever ]
    }
    function exec_dependencies() {
        true # No dependencies - noop
    }
    function exec_goal_body() {
        echo "${B}" > whatever
    }
    if reached; then
        echo "  goal 'whatever' [already satisfied]"
    else
        exec_dependencies
        echo "  goal 'whatever' ..."
        exec_goal_body
    fi
}

function goal:clean() {
    function reached() {
        false # No @reached_if - always execute goal
    }
    function exec_dependencies() {
        true # No dependencies - noop
    }
    function exec_goal_body() {
        rm -f foobar whatever
    }
    if reached; then
        echo "  goal 'clean' [already satisfied]"
    else
        exec_dependencies
        echo "  goal 'clean' ..."
        exec_goal_body
    fi
}

function goal:default() {
    function reached() {
        false # No @reached_if - always execute goal
    }
    function exec_dependencies() {
        goal:foobar
        goal:whatever
    }
    function exec_goal_body() {
        cat foobar
        cat whatever
    }
    if reached; then
        echo "  goal 'default' [already satisfied]"
    else
        exec_dependencies
        echo "  goal 'default' ..."
        exec_goal_body
    fi
}

"goal:${1:-default}"

Once this script is compiled, it would be a simple matter of just executing it with the shellExec() function. It's a fairly simple Bash script you could save to a file and run if you wanted, and it contains most of the important features of Makesure.
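One caveat with the compiled script above, offered as a sketch rather than a verdict: Bash (like POSIX sh) keeps all function definitions in a single global namespace, even when a definition is nested inside another function. So when goal:default runs its dependencies, the helpers reached, exec_dependencies and exec_goal_body are redefined, and the exec_goal_body that runs afterwards is the last dependency's. The names below (goal_dep, goal_main, goal_main_fixed) are hypothetical and use underscores only so the sketch also runs under plain sh. Giving each goal uniquely named helpers is one possible fix, assumed here for illustration.

```sh
# Hypothetical goals that mirror the nested-helper layout of the compiled script.
goal_dep() {
    exec_goal_body() { echo "body of dep"; }
    exec_goal_body
}

goal_main() {
    exec_goal_body() { echo "body of main"; }
    goal_dep          # redefines the global exec_goal_body
    exec_goal_body    # now runs the dependency's body, not main's
}

# Possible fix (an assumption): a helper name that is unique to the goal.
goal_main_fixed() {
    goal_main_fixed_body() { echo "body of main"; }
    goal_dep
    goal_main_fixed_body
}

goal_main
goal_main_fixed
```

Running each goal in its own subshell would also contain the redefinitions, since a function defined in a subshell does not leak back into the parent.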

The advantages I can think of:

The prelude only executes once.
We would no longer need to manually specify @lib, @use_lib, or @define - the prelude can just always be in scope.
I suspect it would simplify a lot of code in makesure.awk.

What do you think?

I know I left out lots of details like detecting circular dependencies and whatnot... That kind of logic probably needs to remain as it is now. The actual script execution would just be the last stage, after we've already validated the Makesurefile.

Actually, a very similar approach is used by the Taskfile project, but without the preprocessing (awk) step.

Believe it or not, this is almost how it was implemented initially. We even used to have a -p option to output the resulting shell script. But this was eventually discarded, because the approach proved to add more issues than it solved.

Multiple reasons for this:

1. We want to enforce independence of goals. With a single-script execution model, nothing prevents the user from declaring variables in one goal and using them in another, which makes the logic imperative and convoluted.
2. We support the -t option for timings. This option is almost incompatible with the whole-script approach. It's possible, but it requires putting a lot of code in the shell (vs. awk), which is unpleasant and bloats the final script.
3. sh vs. bash nuances. These are much more straightforward to handle with the current approach.
4. The aforementioned issue with dependency order calculation. Yes, you can do it in bash, similar to the Taskfile approach. But then you need to do a lot of manual work to implement run-dependency-only-once semantics, and it's not possible to detect circular dependencies at all (they will just cause an infinite loop).
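To make the "manual work" for run-once semantics concrete, here is a rough sketch of what it might look like if done in the shell. The helper run_once, the variable DONE_GOALS and the goal names are all hypothetical, not anything makesure or Taskfile actually generates.

```sh
# Hypothetical bookkeeping: a space-delimited list of goals already started.
DONE_GOALS=" "

run_once() {
    case "$DONE_GOALS" in
        *" $1 "*) return 0 ;;    # already ran in this invocation, skip it
    esac
    DONE_GOALS="$DONE_GOALS$1 "  # mark before running
    "goal_$1"
}

goal_foobar()  { echo "foobar"; }
goal_a()       { run_once foobar; echo "a"; }
goal_b()       { run_once foobar; echo "b"; }
goal_default() { run_once a; run_once b; echo "default"; }

run_once default
```

Here foobar runs once even though both a and b depend on it. Because a goal is marked before it runs, a cycle would be cut short silently rather than reported, so this does not give real circular dependency detection. It also relies on shared shell state, so the marker would be lost if dependencies were invoked from inside a per-goal subshell.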
The initial idea of generating shell first seems good for two reasons: 1) it can simplify debugging, because it's easier to spot a bug by looking at the generated script; 2) it opens up the option of reusing the resulting script on its own if need be. On closer look, neither of these really holds. It turns out to be almost impossible to prepare a self-contained shell script that is functionally equal to the initial Makesurefile, due to the issues above. You can generate one, but then it will lack timing, good dependency calculation, proper goal isolation, etc. In other words, it won't preserve the semantics of the original Makesurefile. It will also lack the best parts of makesure, like goal introspection via -l/-la. Also, since "there should always be only one way", there is no reason at all to expose the generated shell script, because it would allow some unexpected usage scenarios.

Very good points. I have some thoughts about them, but please don't interpret this as me trying to argue or convince you of anything. I share them only to help flesh out the idea and be helpful.

For point 1, you're right: my compiled script above does have the problem of goals not being independent. However, using subshells would largely fix this, e.g.:

function goal:default() {
    (
        # Now we're running in a subshell, so goals can't modify global state
    )
}
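As a quick check of the subshell idea, the sketch below shows a variable assigned in one goal staying invisible to the next. The goal names and the LEAKED variable are hypothetical, and underscores are used in the function names only so it also runs under plain sh.

```sh
unset LEAKED

goal_first() {
    (
        LEAKED="set in goal_first"   # assignment stays inside the subshell
        echo "first sees: ${LEAKED}"
    )
}

goal_second() {
    (
        # The variable from goal_first never reached the parent shell.
        echo "second sees: ${LEAKED:-<unset>}"
    )
}

goal_first
goal_second
```

The same containment applies to cd and to function definitions made inside the parentheses. Anything written to the filesystem is of course still shared.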

For point 2... yeah you're probably right. To be honest, timing isn't terribly important to me personally. The Taskfile approach to timing seems like it's not terribly convoluted with the time command, but I don't know how well it works in the real world.

Point 3: It looks like to support sh, we'd just have to change functions so that instead of function goal:default() it would need to be something like __goal_default().
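A minimal sketch of that renaming, assuming a __goal_ prefix and a hypothetical mangle helper: POSIX function names are limited to letters, digits and underscores, so any other character in a goal name has to be mapped to something. The mapping below is an illustration only, and it can collide (build-all and build_all would end up as the same function).

```sh
# Hypothetical helper: replace every character outside [A-Za-z0-9_] with "_".
mangle() {
    printf '%s' "$1" | tr -c 'A-Za-z0-9_' '_'
}

# Dispatch, mirroring "goal:${1:-default}" from the Bash version.
run_goal() {
    "__goal_$(mangle "${1:-default}")"
}

__goal_default()   { echo "default ran"; }
__goal_build_all() { echo "build-all ran"; }

run_goal            # no argument, falls back to the default goal
run_goal build-all  # dispatched to __goal_build_all
```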

Point 4: I totally agree that dependency order calculation, circular dependency detection, etc. should definitely NOT go in the script. Also the listing of goals via -l. I do think all that should stay in awk. The script would be generated after awk has already figured that stuff out. Then after the script has been executed, it would just be discarded.
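To illustrate that split, here is a sketch of what a generated script might look like once the order has already been resolved elsewhere. The order list (foobar, then default), the run_plan helper and the underscore naming are all assumptions made for the example. The script itself contains no dependency logic.

```sh
# Each goal body runs in a subshell: note the ( ... ) function bodies.
goal_foobar() (
    if [ -f foobar ]; then
        echo "  goal 'foobar' [already satisfied]"
    else
        echo "  goal 'foobar' ..."
        echo "this is setup code" > foobar
    fi
)

goal_default() (
    echo "  goal 'default' ..."
    cat foobar
)

# Hypothetical driver: runs the goals in a precomputed order inside directory $1.
run_plan() (
    cd "$1" || exit 1
    for g in foobar default; do
        "goal_$g" || exit 1   # stop at the first failing goal
    done
)

PLAN_DIR=$(mktemp -d)   # scratch directory so the sketch touches no real files
run_plan "$PLAN_DIR"    # first run creates foobar
run_plan "$PLAN_DIR"    # second run reports foobar as already satisfied
rm -rf "$PLAN_DIR"
```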

I do think the "single compiled script" approach leaves more room for users to abuse things in ways I haven't considered... but it strikes me as adhering more to the "worse is better" philosophy. I'm not terribly concerned about users who are trying to shoot themselves in the foot - I'm happy as long as we just avoid giving them footguns that are easy to make mistakes with.

So let's return to the advantages you mentioned:

The prelude only executes once.

But the whole idea of the prelude as a script that runs before all goals seems questionable. I'm considering removing this feature in its current form in https://github.com/xonixx/makesure/issues/84

We would no longer need to manually specify @lib, @use_lib, or @define - the prelude can just always be in scope.

Well, if need be, we can handle this in the current model by making a default @lib (or even an unnamed @lib) mix into all goals implicitly (though I'm very suspicious of any feature that includes an element of "implicit").

I suspect it would simplify a lot of code in makesure.awk.

So it seems this could be one of the "real" reasons to do it. However, I doubt it would make things much simpler. The opposite effect is possible, however.

Relying on an imperative script as the internal execution model makes the whole thing more imperative in nature. I see this as limiting. The value of the current execution model is that goals are separate entities and can be handled as such. So we can wrap each goal in any common code we need (error handling, timing, logging, etc.).
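As a rough illustration of wrapping each goal in common code, the sketch below runs every goal body in its own shell process and adds logging, timing and error handling around it. The run_goal helper and the message format are hypothetical, not makesure's actual implementation, and timing is in whole seconds via date +%s.

```sh
# Hypothetical wrapper: $1 is the goal name, $2 is the goal body as a string.
# Note: plain sh has no local variables, so name/body/start/status are global.
run_goal() {
    name=$1
    body=$2
    echo "  goal '${name}' ..."
    start=$(date +%s)
    # -e makes the child shell stop at the first failing command.
    sh -e -c "$body" && status=0 || status=$?
    elapsed=$(( $(date +%s) - start ))
    if [ "$status" -ne 0 ]; then
        echo "  goal '${name}' failed with status ${status}" >&2
    fi
    echo "  goal '${name}' took ${elapsed} s"
    return "$status"
}

run_goal hello 'GREETING=hi; echo "$GREETING"'
# A separate process per goal means GREETING is not visible here.
run_goal second 'echo "GREETING is ${GREETING:-<unset>}"'
```

Because each body runs in a separate process, the isolation between goals comes for free, and the wrapper can change without touching the goal bodies.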

Given the considerations above, I would prefer to close this for now.

xonixx / makesure

Idea: Compile Single Script #83