garden-rs / garden

Garden grows and cultivates collections of Git trees ~ Official mirror of https://gitlab.com/garden-rs/garden
https://garden-rs.gitlab.io
MIT License

configure to not skip not existing tree on command #33

Closed jhoogeboom closed 7 months ago

jhoogeboom commented 7 months ago

I'm growing my garden file with some backup commands to back up and restore from borg. As a test I'm trying to restore from a borg backup, but trees that are not grown are skipped for commands.

example:

variables:
  source: "${HOME}"
  storage: "${HOME}/projects/garden/storage"
  borg: "${HOME}/projects/garden/borg"
  borgname: borg
  borgsecret: secret
garden:
  root: "${HOME}"
templates:
  storage:
    url: "${source}/${TREE_NAME}"
    remotes:
      storage: "${storage}/${TREE_NAME}.git"
    commands:
      setup: |
        mkdir -p ${storage}
        git init --bare ${storage}/${TREE_NAME}.git
      push-storage: "git push storage \"$@\""
      push-var: "git push ${storage} \"$@\""
      borg-setup: |
        mkdir -p ${borg}/${TREE_NAME}
        BORG_PASSPHRASE=${borgsecret} borg init --append-only --encryption=repokey ${borg}/${TREE_NAME}
      borg-backup: |
        cd ${TREE_PATH}
        BORG_PASSPHRASE=${borgsecret} borg create ${borg}/${TREE_NAME}::{now} .
      borg-restore: |
        mkdir -p ${TREE_PATH}
        cd ${TREE_PATH}
        snapshot=$(BORG_PASSPHRASE=${borgsecret} borg list ${borg}/${TREE_NAME} --last 1 --short)
        BORG_PASSPHRASE=${borgsecret} borg extract ${borg}/${TREE_NAME}::$snapshot
      borg-check: |
        BORG_PASSPHRASE=${borgsecret} borg check ${borg}/${TREE_NAME}
trees:
  projects/garden/repo1:
    templates:
      - storage

If I do garden borg-restore projects/garden/repo1 it is skipped with # projects/garden/repo1 (skipped). Is there a way to override this skipping?

davvid commented 7 months ago

There's not a direct way to do this currently. Thanks for the example config -- I think I understand why you would want to have a way to run commands even though the tree doesn't exist yet.

The simplest way to do something like this today would be to move the commands to a top-level commands block (outside of the tree context) and then add a "root" tree that garden will use when it runs commands from the root.

e.g.

commands:
  setup: ...

trees:
  root:
    path: ${GARDEN_CONFIG_DIR}

I think there could be some utility in extending custom commands so that they can run against missing trees. The main thing that we'd have to change is to add an opt-in flag (e.g. garden <cmd> --missing ...) and then come up with a rule about what the current directory should be when running commands over missing trees.

Right now we always run commands with the current working directory set to the tree's location. A somewhat straightforward extension to the behavior would be to use the ${GARDEN_ROOT} directory as the fallback directory when operating on missing trees. garden.root defaults to ${GARDEN_CONFIG_DIR} so it'd effectively be running in the same directory as the garden.yaml garden file, but in your example above it would run from ${HOME}.

A further extension of this idea would be to make this configurable so that command-line options are not needed, e.g.

garden:
  missing-tree-path: ${GARDEN_CONFIG_DIR}

... and then whenever we encounter a missing tree we would use the specified fallback path.

The only downside to these two approaches is that there's only a single global fallback directory. If we wanted it to be flexible and granular we could extend the tree spec so that each tree can specify a custom fallback location.

e.g.

trees:
  example:
    fallback-path: ${HOME}

This is the most flexible since it'd let us be very specific and we can control the behavior on a per-tree basis. I kinda like that option the best and it should be relatively straightforward to implement. This would apply to templates as well, of course.

The ${TREE_PATH} and other variables would not change, though - we'd probably still want ${TREE_PATH} to refer to the tree's real path, not the fallback path. We could provide a separate ${TREE_FALLBACK_PATH} but we might not even need a variable for the fallback path. I'm curious to hear your thoughts on that.
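To make the distinction concrete, here is a minimal shell sketch of the idea, not garden's actual implementation. The run_tree_command helper and its arguments are hypothetical, and TREE_PATH is modeled as a plain environment variable purely for illustration. The command runs from the fallback directory when the tree is missing, while TREE_PATH keeps pointing at the tree's real location.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not part of garden).
# Usage: run_tree_command TREE_PATH FALLBACK_DIR COMMAND
# The body runs in a subshell so the directory change does not leak out.
run_tree_command() (
    TREE_PATH=$1
    fallback=$2
    cmd=$3
    # TREE_PATH always refers to the real tree path, even when it is missing.
    export TREE_PATH
    if [ -d "$TREE_PATH" ]; then
        cd "$TREE_PATH" || exit 1
    else
        # The tree has not been grown yet: run from the fallback directory.
        cd "$fallback" || exit 1
    fi
    sh -c "$cmd"
)
```

With this shape, a command such as mkdir -p "$TREE_PATH" still creates the real tree directory even though it starts out in the fallback directory, so a separate fallback variable may not be needed.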

Does a per-tree fallback path look like a workable solution? If so I can take a stab at getting this into the next feature release.

jhoogeboom commented 7 months ago

I'm not sure if I can follow your first example. If I move it like you describe:

---
variables:
  source: "${HOME}/projects/garden/storage"
  storage: "${HOME}/projects/garden/storage"
  borg: "${HOME}/projects/garden/borg"
  borgname: borg
  borgsecret: secret
commands:
  setup: |
    mkdir -p ${storage}
    git init --bare ${storage}/${TREE_NAME}.git
  push-storage: "git push storage \"$@\""
  push-var: "git push ${storage} \"$@\""
  borg-setup: |
    mkdir -p ${borg}/${TREE_NAME}
    BORG_PASSPHRASE=${borgsecret} borg init --append-only --encryption=repokey ${borg}/${TREE_NAME}
  borg-backup: |
    cd ${TREE_PATH}
    BORG_PASSPHRASE=${borgsecret} borg create ${borg}/${TREE_NAME}::{now} .
  borg-restore: |
    mkdir -p ${TREE_PATH}
    cd ${TREE_PATH}
    snapshot=$(BORG_PASSPHRASE=${borgsecret} borg list ${borg}/${TREE_NAME} --last 1 --short)
    BORG_PASSPHRASE=${borgsecret} borg extract ${borg}/${TREE_NAME}::$snapshot
  borg-check: |
    BORG_PASSPHRASE=${borgsecret} borg check ${borg}/${TREE_NAME}
# garden:
#   root: "${HOME}"
templates:
  storage:
    url: "${source}/${TREE_NAME}"
    remotes:
      storage: "${storage}/${TREE_NAME}.git"

trees:
  root:
    path: ${GARDEN_CONFIG_DIR}
  projects/garden/repo1:
    templates:
      - storage

And I do a garden borg-restore projects/garden/repo1, it is skipped. I understand that with the root tree I can then use that to do something like garden cmd root but it is missing then the ${TREE_PATH} that is used in the commands.

The ${TREE_PATH} and other variables would not change, though - we'd probably still want ${TREE_PATH} to refer to the tree's real path, not the fallback path. We could provide a separate ${TREE_FALLBACK_PATH} but we might not even need a variable for the fallback path. I'm curious to hear your thoughts on that.

I think it would be best if the ${TREE_PATH} would be the real tree path. Not sure what I would use the ${TREE_FALLBACK_PATH} for or what the value of it would be.

davvid commented 7 months ago

I'm not sure if I can follow [...]

That's my fault for not spelling out the caveats...

but it is missing then the ${TREE_PATH} that is used in the commands

The caveat is that root commands can only be a simple initial setup command that's independent of the other trees or their variables. In this case, we're using ${TREE_PATH} and ${TREE_NAME} in the commands so in order to make that work we'll need to change garden.

What I had in mind was that only a single command would be added at root scope (or under a fake tree called "root" like in the example below) and the rest of the commands would stay grouped under the tree just like in your original example.

I don't know much about borg backup so pardon the surface-level questions ~ is your workflow such that borg will create the repo directories, manage .git and manage creating the worktree for you? Along those lines, you're not using garden grow to create the initial repos in your workflow, correct?

The point being that your inception point, as far as I now understand it, isn't going to go through git clone under the hood like it usually happens with garden grow. You're looking to let borg create stuff instead.

The default behavior of supporting sparsity (by skipping trees) is there because, for example, I might share a garden file across desktops and laptops and each machine might have a different sparse set of trees checked out.
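As a rough illustration of that skipping behavior, here is a shell sketch under stated assumptions. It is not garden's code: the run_over_trees helper is hypothetical, and only the "(skipped)" message format is taken from the output reported earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not part of garden).
# Usage: run_over_trees ROOT COMMAND TREE...
# Runs COMMAND inside each tree directory under ROOT and skips any tree
# whose directory does not exist on this machine.
run_over_trees() (
    root=$1
    cmd=$2
    shift 2
    for tree in "$@"; do
        if [ -d "$root/$tree" ]; then
            # The tree exists: run the command with the tree as the cwd.
            (cd "$root/$tree" && sh -c "$cmd")
        else
            # The tree is missing: report it and move on.
            echo "# $tree (skipped)"
        fi
    done
)
```

The same config can then be shared between machines, and each machine only runs commands over the trees it actually has checked out.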

It looks like the workflow you're looking to attain with borg is that borg will create all the repos for you, so instead of garden grow you'd like to have it run a custom command. The custom command has to run in a tree's context due to all of the ${TREE_PATH} and ${TREE_NAME} references in the commands.

The only way to do that is going to be to extend garden to let us run commands against missing trees.

Let me know if that sounds right to you.

What I'll start hacking on is making it so that garden borg-restore -f projects/garden/repo1 runs the command as expected.

-f would be shorthand for --force, but I'm all ears if you have a better suggestion for the short and long option names.

Oh, and in case you wanted to make this work today, I think this would do it, albeit with some repetition in the config file since the tree names would be repeated in two places. The workaround is to run mkdir separately, up front.

commands:
trees:
  projects/garden/repo1:
    templates: storage
  # rest of trees here
  root:
    path: ${GARDEN_ROOT}
    commands:
      init-trees: |
        for repo in repo1 repo2 repo3 repo4
        do
            mkdir -p projects/garden/$repo
        done

# templates, etc.

Once you add that you can cd to the garden root directory and run:

garden init-trees
garden borg-restore 'projects/garden/*'

and it should do the right thing and create the initial directories. Creating the initial directory is all it takes for garden to make commands run, which is what garden init-trees does. The subsequent garden borg-restore ... then works as expected because the directories were created by the previous command.

With that spelled out, it's pretty clear that it'd be pretty sweet if we could eliminate the garden init-trees step. Let me know what you think about the option naming, etc.

At this point I don't think we even need to extend the tree spec. It's probably better to keep it simple and just add a command-line option since it'll probably only be used once. If there's some future case for per-tree fallback dirs we can always add that later.

For now, if the tree path is missing (and --force is specified) we'll use the ${GARDEN_ROOT} directory and run the commands from there instead.

If the ${GARDEN_ROOT} is also missing we'll also try falling back to ${GARDEN_CONFIG_DIR} since we know that it must exist (otherwise the config file wouldn't exist either).
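The fallback order described above could be sketched in shell roughly as follows. This is only an illustration of the rule, not the code that went into garden. The resolve_workdir helper is hypothetical, and it models only the case where --force was given.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not part of garden).
# Usage: resolve_workdir TREE_PATH GARDEN_ROOT GARDEN_CONFIG_DIR
# Prints the first of the three directories that exists, in that order.
# The body runs in a subshell, so "exit" only ends the function call.
resolve_workdir() (
    for dir in "$1" "$2" "$3"; do
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
            exit 0
        fi
    done
    # None of the candidates exist. This should not happen in practice
    # because the config directory holds the garden file itself.
    exit 1
)
```

A caller would cd into the printed directory before running the custom command.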

Once we add the --force option then we won't need the init-trees command or the root tree, so it's a nice simplification in my book.

davvid commented 7 months ago

If you try the latest from git this should work now:

garden borg-restore -f 'projects/garden/*'

Note the -f option.