Open bollwyvl opened 7 months ago
Oh, and got a bit wrapped around the form input: thanks for pixi
, and congrats on the recent big feature releases! :heart_eyes_cat:
He @bollwyvl, very nice issue report!
I think there is a misunderstanding of the feature of inputs
and outputs
. I'm not sure if it is the naming or the documentation but they're not used for task graph generation. Only the depends_on
will create a graph. The inputs
and outputs
are only used to define caching for the task.
[tasks.build]
inputs = ["build/preflight.txt"]
outputs = ["build/built.txt"]
cwd = "build"
cmd = """
cat preflight.txt > built.txt
&& sleep 1
&& echo "$(date) built" >> built.txt
"""
depends_on = ["preflight"] # <<<
[tasks.test]
inputs = ["build/built.txt"]
outputs = ["build/test.txt"]
cwd = "build"
cmd = """
rm -f test.txt
&& cat built.txt > test.txt
&& sleep 1
&& echo "$(date) tested" >> test.txt
"""
depends_on = ["build"] # <<<
Adding these two depends on creates the required graph.
Ps. I like the idea to generate mermaid graphs from the task definitions. :star_struck:
Ok, adding the explicit depends_on
:
diff --git a/pixi.toml b/pixi.toml
index dece1d5..d2aca77 100644
--- a/pixi.toml
+++ b/pixi.toml
@@ -17,6 +17,7 @@ cmd = """
"""
[tasks.build]
+depends_on = ["preflight"]
inputs = ["build/preflight.txt"]
outputs = ["build/built.txt"]
cwd = "build"
@@ -27,6 +28,7 @@ cmd = """
"""
[tasks.test]
+depends_on = ["build"]
inputs = ["build/built.txt"]
outputs = ["build/test.txt"]
cwd = "build"
@@ -38,4 +40,4 @@ cmd = """
"""
And running...
pixi build
Then...
pixi run clean
pixi run test
Yields...
✨ Pixi task (default): mkdir -p build
&& cd build
&& echo "$(date) preflight" > preflight.txt
&& echo "$(date) preflight extra" > preflight-extra.txt
Task can be skipped (cache hit) 🚀
✨ Pixi task (default): cat preflight.txt > built.txt
&& sleep 1
&& echo "$(date) built" >> built.txt
Task can be skipped (cache hit) 🚀
✨ Pixi task (default): rm -f test.txt
&& cat built.txt > test.txt
&& sleep 1
&& echo "$(date) tested" >> test.txt
Task can be skipped (cache hit) 🚀
...Does not invalidate the graph, and it still thinks everything is fine. It can't be forced to actually run any of the tasks without digging into the .pixi
folder.
It's also surprising that while there are a lot of emoji emitted, as well as the full body of the task, the name of the task running does not appear.
My goal (ha) is to for the time one person takes to capture the task dependency information in a pixi
-compatible way, however it has to be captured, to yield the benefits for a whole project of other graph-based systems (make
, doit
, etc), which would include:
Some other things that would be lovely that a rock-solid graph can provide:
Having to apply another task tool provisioned by pixi, but defined in another file would make things... even more confusing.
generate mermaid graphs from the task definitions.
As for mermaid: sure, it's really quite nice, and getting more portable. More, in that sadly the github renderer doesn't yet support the elkjs backend, which can do some really nice things at scale:
I agree with you, this output or yours seems a bug. For me it does run the preflight
but it doesn't invalidate the cache going down the graph.
Pinging @wolfv to ask for an explanation on the current design.
I see today's release :rocket: has a tree
command for packages by env: very nice! As a corollary to the markdown-style output, a tig
-style pixi task graph
would be another magical way to display robust task graph data:
state name depends_on why state?
───── ────────── ─────────┬ ─────────────────────────
✓ preflight ●
? build ◓╯ 1 inputs changed
? test ○╯ 1 depends_on out-of-date
───── ────────── ────────── ─────────────────────────
clean ⊛
───── ────────── ────────── ─────────────────────────
Looking a bit at file_hashes.rs
, it seems like it only uses globs, so a missing file indeed wouldn't get picked up. Not sure how to fully reason about it: globs, glob intersections, and missing glob things all get very tricky.
Point taken regarding not logging the skipped task name.
I debugged this and the main issue is that we take into account the cwd
for the inputs/outputs tasks.
If you change your pixi.toml to:
[tasks.build]
depends_on = ["preflight"]
inputs = ["preflight.txt"]
outputs = ["built.txt"]
cwd = "build"
cmd = """
cat preflight.txt > built.txt
&& sleep 1
&& echo "$(date) built" >> built.txt
"""
Things start to work.
Now, I am not sure what we should do ... use the cwd
or not? I do think that we may not want to use the cwd
for the inputs
and outputs
path computation.
Lastly, we should warn the user when the globset didn't match anything at all so that they can further debug. If inputs or outputs are specified, it can be expected that the user meant to select something.
Nice, #1202 looks like a good update, and dropping the cwd
is indeed solid. Again, whatever the convention has to be, is fine, as long as its consistent: i think anchoring to the PROJECT_ROOT is the least surprising thing to do. The /build
syntax specifically looks weird to me, so not requiring that to be able to use cwd
will probably feel good.
Regarding some of the debugging: having a way to introspect from the CLI would be super helpful with e.g. pixi task info [--json] [TASK...]
, which would complement pixi task graph
: from experience, it's not really plausible to display both a linked dual task and file hierarchy for the general case of all tasks in any project, but showing the task-specific slice of both would be very helpful.
Checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pixi, using
pixi --version
.Reproducible example
But fails in unexpected ways with most other invocations:
Expected behavior
I guess my mental model of
inputs
andoutputs
is something like, given the above graph:Where deleting a file would either mark a task as directly invalidated (❌) or indeterminate (❓), since the parent state is unknown:
preflight
build
test
clean
pixi.toml
preflight.txt
preflight-extra.txt
built.txt
test.txt
Running a goal task would first ensure all of the parent tasks were valid, and not even bother checking e.g.
cwd
.I also see
depends_on
is a thing, but requires double-encoding of both a concrete file name and a semantic name feels very unsatisfying.