go-task / task

A task runner / simpler Make alternative written in Go
https://taskfile.dev
MIT License
11.59k stars 618 forks source link

Interpolated variables should be factored into checksum #548

Open Porges opened 3 years ago

Porges commented 3 years ago

Example Taskfile showing the issue

version: '3.7'

run: when_changed

vars:
  VERSION: 1

tasks:
  generate-file:
    sources:
    - base.txt
    cmds:
    - <base.txt sed -e 's/VERSION/{{.VERSION}}/g' > versioned.txt

When changing VERSION from 1 to 2, this should cause generate-file to re-run. At the moment it doesn’t because interpolated variables are not incorporated into the checksum.

Porges commented 3 years ago

Also, ideally, the task itself would be checksummed somehow, so that if I change the commands it will re-run.

andreynering commented 3 years ago

Hi @Porges,

It's currently possible to instruct Task to invalidate the checksum on specific variables changes by adding a label: prop to the task:

version: '3'

tasks:
  generate-file:
    cmds:
      - ...
    label: 'generate-file-{{.VERSION}}'

It's a valid discussion whether we should do that by default or not, as you suggested. I think it may be unexpected to some users. For example: if a variable isn't deterministic (a date/time or something) the task will always run and that may not always be the expected behavior.

This makes me think that, if done, there should probably be a setting to choose the appropriate behavior. But maybe label: is enough and we would not even need that...

ghostsquad commented 2 years ago

I found myself changing my task file, and later wondering how I was going to invalidate the cache automatically. It's not really just vars, it's actually the entire task definition. I imagine that you may want to checksum the output of --dry-run as part of the caching behavior, or at least some flag to enable or disable that.

I'm curious what people think about having the default behavior include the task definition itself (and variable values) as part of the checksum, with a flag to disable 1 or both of those?

ahus1 commented 2 years ago

@ghostsquad: +1 for making the task definition part of the checksum (not sure about interpolations as a new user yet). I admit that I'm new to task, still the first thing I did was adding Taskfile.yaml as one of the sources (which would build too much).

ghostsquad commented 2 years ago

For now, I'm including Taskfile.yml itself in sources. This is equivalent of using a sledge hammer when a rubber mallet would be sufficient.

ahus1 commented 2 years ago

This is equivalent of using a sledge hammer when a rubber mallet would be sufficient.

I know, and it wasn't fun. Next iteration was task -f taskname, but found that it also forces the dependencies of a task to run.

Current iteration: a small script that extracts each task from the Taskfile.yaml and use that as a source: once the task changes, run the task again. Works OK for now.

# Taskfile.yaml
tasks:
  split:
    desc: Split Taskfile to one-file-per-task for dirty checking
    run: once
    cmds:
      - bash -c ./split.sh
    sources:
      - Taskfile.yaml
      - split.sh
    silent: true

  ipchange:
    deps:
      - split
    cmds:
      # ...
    sources:
      - .task/subtask-{{.TASK}}.yaml
# split.sh
set -euo pipefail
mkdir -p .task
rm -f .task/subtask-*.yaml
for i in $(yq -M e '.tasks | keys' Taskfile.yaml); do
   yq -M e ".tasks.${i}" Taskfile.yaml > .task/subtask-${i}.yaml
done
andreynering commented 2 years ago

Hi @ahus1,

Have you tried my suggestion done here? https://github.com/go-task/task/issues/548#issuecomment-941679515

I believe that's what you're are looking for.

ahus1 commented 2 years ago

@andreynering - your suggestion will solve AFAIK the problem when a variable changes, as the labels changes with the variable. I don't know what will happen if the variable changes back to the old value: Would it then run again, or will the state with the old variable value / label name will be still there and it will not run?

This thread evolved form interpolating variables to also include the body of the task definition, so that whenever the definition changes, the task should run again. The variable/label approach doesn't cover that, so I looked for something else. So I came up with the idea to split the taskfile by tasks, and then use that as a "sources" element. I did some iterations with that, and it looks ok for me. although a bit verbose.

ghostsquad commented 2 years ago

I think what is realistically needed is a "cache key". I don't know what else that label does. But generally, if we are able to derive our own "key" tied to the task definition itself, that means that we can invalidate the cache arbitrarily.

aminya commented 1 year ago

I tried using label: {{.TASK}}-{{.VARIABLE}} in addition to sources and generates, but still the task is incorrectly skipped. Is there a way to fix this issue? I have to disable the whole caching because of this bug

ypid-work commented 1 year ago

https://github.com/go-task/task/issues/548#issuecomment-900637214

Also, ideally, the task itself would be checksummed somehow, so that if I change the commands it will re-run.

455 already exists where this is discussed.

dcecile commented 11 months ago

My workaround is to add a silent command that does nothing except reference the used variables:

      - cmd: 'true # /{{.WATCH}}/{{.RESTART}}/{{.MODE}}/{{.TEST}}/{{.NAME}}/{{.ARGS}}/'
        silent: true

So it seems like interpolated variables are factored in with 3.29.1? I think my issue was some of my other usages weren't factored in.

Other usages ```yaml - task: :watchman:client vars: WATCH: '{{.WATCH}}' RESTART: '{{.RESTART}}' MODE: executable COMMAND: build/target/{{.MODE}}/intertwine-{{.NAME | trimSuffix "-rs"}}{{if eq .TEST "true"}}-test{{end}} {{.ARGS}} {{.CLI_ARGS}} ```
aminya commented 11 months ago

I fixed this issue in https://github.com/go-task/task/pull/1401

It just needs addressing some comments otherwise it works as expected.

reitzig commented 2 months ago

FWIW: The docs say (emphasis mine):

when_changed only invokes the task once for each unique set of variables passed into the task

As a newcomer, it was quite surprising to me that changes in vars did not cause a re-run!

The workaround with label works, I guess, but a) is not intuitive from the docs, b) does not look nice in the CLI output (in my case), and c) causes lots of files in .task/checksum over time.

My workaround is to (always) write the values (resp. a checksum of theirs) to a file and include that as source:

tasks:
  init-submodules-status:
    internal: true
    silent: true
    cmds:
      - git submodule status --cached | sha256sum | cut -d' ' -f1 > .task/git-submodules-status
    run: always
  init-submodules:
    deps:
      - init-submodules-status
    sources:
      - .gitmodules
      - .task/git-submodules-status
    cmds:
      - git submodule update --init
    run: when_changed

(Well, I need to re-run basd on changes in the output of a command, but I guess the approach carries over to writing the values of variables into files?)