Feature request: Cached recipes

tgross35 commented 8 months ago

For quite a few CMake projects, I have separate recipes for configuration and build, with configure being a dependency of build. Normally the configuration does not need to be rerun unless variables change, and I would prefer that it run as little as possible because it can sometimes take longer than the build.

Currently I have something a bit messy that stores a hash of all captured variables to check if anything changed:

# Configure CMake
configure build-type="Debug" projects="clang":
    #!/bin/sh
    # Hash all configurable parts
    hash="{{ sha256(source_dir + build_dir + build-type + install_dir + projects + linker_arg) }}"
    # A missing hash file (first run) reads as empty and counts as outdated
    if [ "$hash" = "$(cat '{{config_hash_file}}' 2>/dev/null)" ]; then
        echo "configuration up to date, exiting"
        exit 0
    else
        echo "config outdated, rerunning"
    fi

    printf '%s' "$hash" > "{{config_hash_file}}"

    cmake "-S{{source_dir}}/llvm" "-B{{build_dir}}" \
        -G Ninja \
        -DCMAKE_C_COMPILER_LAUNCHER=sccache \
        -DCMAKE_CXX_COMPILER_LAUNCHER=sccache \
        -DCMAKE_EXPORT_COMPILE_COMMANDS=true \
        "-DCMAKE_BUILD_TYPE={{build-type}}" \
        "-DCMAKE_INSTALL_PREFIX={{install_dir}}" \
        "-DLLVM_ENABLE_PROJECTS={{projects}}" \
        "{{linker_arg}}"

My suggestion is to add a way to do this by default. The above would become:

[cached]
configure build-type="Debug" projects="clang":
    cmake "-S{{source_dir}}/llvm" "-B{{build_dir}}" \
    # ...

And Just would need to do the following:

Evaluate all captures in the recipe
Create a deterministic hash of the captures (and maybe also environment variables?)

If the recipe has never been run before, store this information in the cache directory, roughly as follows:

{
"cached_recipes": [
  { "path": "/home/user/project/justfile", "recipe": "configure", "hash": "09ca7e4e...", "last_run": "2024-01-21T08:40:52Z" },
  // ...
]
}

If an entry for that file and recipe already exists, compare the hash. Skip the recipe if the hash is the same; otherwise run it and update the stored hash.
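For reference, the steps above could be sketched in plain shell roughly like this. It is only an illustration of the proposed behaviour, not how Just works or would implement it: `CACHE_DIR`, `cache_key`, and `run_cached` are hypothetical names, the cache is one small file per (justfile, recipe) pair instead of the JSON layout shown above, and `sha256sum` from GNU coreutils is assumed to be installed.

```sh
# Hypothetical cache location; override by setting CACHE_DIR.
CACHE_DIR="${CACHE_DIR:-${TMPDIR:-/tmp}/just-cache-sketch}"

# Hash the evaluated captures. Each value goes on its own line so that
# ("ab", "c") and ("a", "bc") give different hashes. Values that contain
# newlines themselves are not handled by this sketch.
cache_key() {
    printf '%s\n' "$@" | sha256sum | cut -d' ' -f1
}

# run_cached JUSTFILE RECIPE HASH COMMAND...
# Runs COMMAND unless the hash stored for (JUSTFILE, RECIPE) equals HASH.
run_cached() {
    justfile=$1 recipe=$2 hash=$3
    shift 3
    # One cache entry per (justfile, recipe) pair.
    entry="$CACHE_DIR/$(cache_key "$justfile" "$recipe")"
    if [ -f "$entry" ] && [ "$(cat "$entry")" = "$hash" ]; then
        echo "$recipe: up to date, skipping"
        return 0
    fi
    # Store the hash only after the command has succeeded.
    "$@" || return
    mkdir -p "$CACHE_DIR"
    printf '%s\n' "$hash" > "$entry"
}
```

A call would look like `run_cached "$PWD/justfile" configure "$(cache_key "$source_dir" "$build_dir" Debug clang)" cmake ...`. Recording the hash after the command, not before, means a failed configure is retried on the next run.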

A more flexible alternative is to have the user specify what gets used as the cache key. This would be easier for Just to implement too, but is less user-friendly.

[cache_keys(source_dir, build_dir, build-type, install_dir, projects, linker_arg, `$SOME_ENV`)]
configure build-type="Debug" projects="clang":
    cmake "-S{{source_dir}}/llvm" "-B{{build_dir}}" \
    # ...

This is slightly related to https://github.com/casey/just/issues/867, since file dependencies are largely used for caching.

casey commented 8 months ago

I think this would probably create a long tail of tricky issues. I've had the experience with a few build systems that sometimes things get cached when they shouldn't, like if a file on disk, environment variable, or binary changes, but the build system isn't aware of it. So I'm open to adding this, but only if it has a simple, minimal implementation which is easy for users to understand. I kind of suspect that this isn't possible, but I'll leave this open in case someone can come up with something clever.
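To make the failure modes above concrete: a hash over recipe variables alone would not change when a file, an environment variable, or a tool binary changes. A rough shell sketch of what a key would additionally have to fold in is below. `input_fingerprint` is a hypothetical helper, not anything Just provides, and it assumes `sha256sum` and `printenv` from GNU coreutils. It also only covers the inputs that are explicitly listed, which is exactly the part that is hard to get right in general.

```sh
# input_fingerprint FILE ENV_NAME TOOL
# Prints a hash that changes when the file's contents, the environment
# variable's value, or the tool's executable changes.
input_fingerprint() {
    file=$1 env_name=$2 tool=$3
    {
        # Contents of the file, not its timestamp.
        sha256sum < "$file"
        # Value of the exported variable, or a marker when it is unset.
        printenv "$env_name" || echo '<unset>'
        # Bytes of the executable that the tool name resolves to.
        sha256sum < "$(command -v "$tool")"
    } | sha256sum | cut -d' ' -f1
}
```

For example, `input_fingerprint CMakeLists.txt CC cmake` would produce a new value after an edit to `CMakeLists.txt`, a different exported `CC`, or a replaced `cmake` binary.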

rhysparry commented 6 months ago

Something that I have done in the past is use || in the command to short-circuit the recipe. This can work well for simple things, but can get a little clumsy as the logic gets more complicated (as in @tgross35's original example).

E.g.

configure:
    [ -f configured ] || ./run-slow-process-to-configure.sh

In this example, it just checks for the existence of the file `configured`, and if it exists, it doesn't do the rest.
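The same short-circuit can be written as a small shell helper. This is a sketch under one assumption: the slow script does not create the marker file itself, so the helper does it. `configure_once` is a made-up name.

```sh
# configure_once MARKER COMMAND...
configure_once() {
    marker=$1
    shift
    # `A || B` runs B only when A fails, so COMMAND is skipped once MARKER exists.
    # `B && C` runs C only when B succeeds, so a failed run leaves no marker behind.
    [ -f "$marker" ] || { "$@" && touch "$marker"; }
}
```

Used as `configure_once configured ./run-slow-process-to-configure.sh`, the script runs on the first call and is skipped afterwards until the `configured` file is deleted.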

If we could better structure this sort of short-circuiting in just, this might provide a straightforward way for users to define how they want to control any skipping behaviour in recipes.

Creating another recipe that defines the logic might be a natural way to achieve this.

configured-file-exists:
    [ -f configured ]

Then it would be a case of deciding how to best express this requirement.

E.g. an attribute

[short_circuit(configured-file-exists)]
configure:
    ./run-slow-process-to-configure.sh

As part of the recipe line:

configured-file-exists || configure:
    ./run-slow-process-to-configure.sh

Or some other way.

rhysparry commented 6 months ago

Another approach leverages the existing method for defining dependencies: create independent recipes for each of the alternative branches, and then a "gate" recipe as the mechanism to combine them.

[private]
check-if-already-configured:
    # do the check

[private]
do-actual-configure:
    # do the configure

configure: (check-if-already-configured || do-actual-configure)

I like that this approach leverages the existing recipe mechanism, although it does lean into having a recipe failing (albeit with a path to recover the overall run).

jrouaix commented 1 month ago

Hi, I found this issue after implementing a first draft of our own solution:

# Handle recipe cached completions

# completion cache directory
completions_cache_dir := justfile_directory() / ".recipe.completions"

clear_cached_runs:
  @rm -rf "{{completions_cache_dir}}"
  @echo "Completions cleared"

# exec a recipe only if it has not been successfully completed yet
cached_run recipe:
  #!/usr/bin/env sh
  # stop at the first failure so a failed run is not marked as completed
  set -e
  if test -f "{{completions_cache_dir}}/{{recipe}}.completed"; then
    echo "'{{recipe}}' already completed"
  else
    just "{{recipe}}"
    mkdir -p "{{completions_cache_dir}}"
    touch "{{completions_cache_dir}}/{{recipe}}.completed"
  fi

######################################
# -----         EXAMPLES       ----- #

# test recipes (this is a syntax test; we'll have to declare one more line for each recipe)
init_mob:
  @echo "INIT_MOB"

test_mob: (cached_run "init_mob")
  @echo "TEST_MOB"

build_mob: (cached_run "test_mob")
  @echo "BUILD_MOB"

publish_mob: init_mob build_mob

Using this example:

$ just test_mob
'init_mob' already completed
TEST_MOB

$ just build_mob
'init_mob' already completed
TEST_MOB
BUILD_MOB

$ just publish_mob
INIT_MOB
'test_mob' already completed
BUILD_MOB

It would be awesome to have a special syntax for that! 🚀

casey / just

Feature request: Cached recipes #1861