Design a strategy for secondary effects of builds (including lockfile generation)

stuhood commented 3 years ago

The side-effects of builds are currently applied-in and limited to @goal_rules via access to the Workspace type. But there is a common case of side-effects which (rather than being the core reason for having run the goal, such as with fmt/tailor) are a "secondary" effect: lockfiles.

In common usage, lockfiles are (re)generated automatically during the course of a build: i.e., if you were to run ./pants test x after making an edit to a requirement, a lockfile would be updated as a secondary effect of the build.

With the current design for side-effects, all goals which might cause a lockfile to be regenerated would need their signatures changed. On the one hand, this makes them more explicit: a goal without explicit support for lockfile handling (or at least a more abstract "affected files Snapshot" concept) would not support it. On the other hand, making side-effects explicit might be helpful, and improve reasoning about when they are applied (for example: the test goal could choose to apply effects regardless of whether some tests had failed, etc).

An alternate design might involve explicitly supporting buffering up Workspace.materialize_directory calls (or their equivalent) in @rules (rather than only in @goal_rules, and applying them only if the entire request succeeds.

stuhood commented 3 years ago

cc @patricklaw , @Eric-Arellano

Eric-Arellano commented 3 years ago

We already have some prior art here: writing report files with ./pants test and ./pants lint. Imo, keeping it explicit remains the way to go. Correctness and being able to reason about what Pants will do is more important than concise code.

stuhood commented 3 years ago

We already have some prior art here: writing report files with ./pants test and ./pants lint. Imo, keeping it explicit remains the way to go. Correctness and being able to reason about what Pants will do is more important than concise code.

Sure, but those reports are arguably directly artifacts of running those goals. In the case of lockfiles, those goals (and a few others... package, etc) would need to be updated to write out a lockfile explicitly, and the sideeffect Digest would need to be attached to all of the intermediate types that they consume (i.e. TestResult, LintResult, etc would need to be updated with a sideeffect Digest field).

stuhood commented 2 years ago

While investigating #14885, it occurred to @tdyas and I that the workunit store and StreamingWorkunitHandler already represent a way to provide a stream of a side-channel outputs. In #14885, EngineAware(Param|ReturnType) would be used to send progress notifications. But in theory, you could also attach well-known side-effect types to EngineAwareReturnType which the engine would consume itself to apply at the end of a run.

The primary challenge with collecting side-effects from workunits is that workunits do not tie very directly to the request node: and in particular, if something is re-tried or speculated, you would want to (transactionally) ignore all but the workunit for the successful node.

So it's possible that this could use EngineAwareReturnType (or a new neighboring type which is more specific to side-effects) to optionally collect the matching (transitive) children of a successful scheduler.execute request.

pantsbuild / pants

Design a strategy for secondary effects of builds (including lockfile generation) #12014