Proposal: Add target distance metrics

alex-torok commented 2 months ago

I'm giving a bazelcon talk in a few weeks about a target distance metric system that I added to an internal bazel-diff-like tool that we have at my company. Target distance metrics are used to power some pretty interesting test avoidance workflows in our CI system.

Ahead of my talk, I'd like to start a discussion for adding a similar functionality to bazel-diff, as it is one of the more popular bazel changed target computation tools. I'm opening this ticket to see if bazel-diff would like to take a contribution on this. If so, I'd like to hear from maintainers on what a recommended UX would be and if there are any obvious pitfalls with implementing something like this in the bazel-diff codebase - I haven't contributed to bazel-diff before and am not very familiar with Kotlin.

Overview

First, some terms:

Directly changed target: A target that is changed due to a source file change, rule attr change, or rule implementation hash change.
Indirectly changed target: A target that is changed due to a target dependency being changed.
Changed target: a directly or indirectly changed target

Target distance metrics are integer values assigned to each changed target to denote how far an indirectly changed target is from a directly changed target. We compute two metrics:

target distance: The number of dependency hops to a directly changed target. Directly changed targets have a target distance of zero, with dependencies of directly changed targets having a distance of one, dependencies of those targets having a value of two, etc.
package distance: The number of dependency hops that cross a package boundary to a directly changed target. Similar to target distance, but only increment when a dependency edge crosses a package boundary.

Package distances are useful, as they can provide a more consistent view of changes across language ecosystems. i.e golang has one target per package, whereas other languages may have dozens of interconnected targets in the same package. A target distance threshold of N may have vastly different meanings between these two ecosystems, whereas package distances are somewhat more consistent across language ecosystems.

Usecases

Here are some examples of how target distance metrics can improve developer experience and automated testing:

Run sanitizers on presubmit for changed tests in the same package as a directly modified target (package distance of zero)
Avoid flaky tests by only running large- sized tests if they are within N target hops in the build graph. (extra filtering prior to running bazel test)

Implementing and using target distance metrics have allowed us to run more relevant tests in presubmit CI while avoiding flaky tests that are more likely to be triggered due to changes to common code that capture a large number of distant changed targets.

Test escapes can happen due to skipping, but happen infrequently and can be easily adjusted by adjusting the distance thresholds.

Implementation

To compute these metrics, an additional hash component is generated and tracked:

Target hash (same as existing target hash)
Direct dependency hash (hashes for srcs, rule attrs, and rule implementation hashes)

The additional direct dependency hash will need to be generated and returned by the generate-hashes command. This could probably be tracked as a detailed-hashes.json file.

To compute the distance metrics the get-impacted-targets command will need the detailed hash values from above, along with a dependency mapping for targets (dict of label to list of dependency labels) from the build graph where the final hashes come from. The generate-hashes command could be updated to have a flag that produces the mapping in a json file at the given path.

Once all of this information is available, we'd perform a depth-first traversal of the changed target graph, looking for targets who's direct dependency hash has changed. We set that target's target and package distance to zero, then increment those values into each depending target's distance metric as we step back up the graph. The values are always selected to represent the shortest path through the graph.

I'm expecting that some additional flags will be needed on get-impacted-targets to pass in the dependency mapping json file and instruct it to perform the distance metric computation.

Instead of the output from get-impacted-targets being a list of target labels, it would be changed to be a json list of dicts like so:

[
   { 
        "label": "//foo:bar",
        "target_distance": 2,
        "package_distance": 0
   },
   ...
]

Questions

Is this something that you would be interested in accepting PRs for?
Are there any obvious trap doors that I should look out for in the areas where I'd likely need to make these changes?

tinder-maxwellelliott commented 2 months ago

Hey there @alex-torok , thank you so much for writing all this up. I have more than enough context on what the proposal is trying to accomplish and I would be interested in using this for our own projects.

Is this something that you would be interested in accepting PRs for?

Absolutely, I can also provide guidance in the codebase as needed

Are there any obvious trap doors that I should look out for in the areas where I'd likely need to make these changes?

I don't believe so. Yes the code is in Kotlin but it has a pretty approachable structure overall and has meaningful coverage

Id love to have this feature

alex-torok commented 2 months ago

Sorry for taking a bit to get back here - the past few weeks have been pretty hectic for me.

I haven't started yet, but have poked around the codebase to find all of the areas that will need to be changed. I agree that it shouldn't be too bad! I have a few non-urgent questions for you:

Does it matter if I change the hashing implementation in a way that affects the output hashes produced by the tool? For our internal system, we cache hash results and need to follow some careful processes for rev-ing the hashing implementation to avoid cache inconsistency. Does that matter here, or am I free to change the code however it is convenient?
For our implementation, we track rule implementation hashes and seed file hashes in the "direct hash" with the rule attrs and srcs. This means that changes to a rule implementation function (or the bzl file's transitive loads) leads to all targets of that rule kind to be directly changed (target_distance=0). The same goes for seed file changes. This has generally worked out okay for us.

Do you agree with this decision to have the rule implementation and seed hash modify the direct hash, or do you have other opinions on how it should work? I've considered keeping track of them all separately so that I could provide a change reason with each changed target (SEED_FILE, TARGET, RULE_IMPLEMENTATION), but haven't really been pressed with a need for such detail.

tinder-maxwellelliott commented 2 months ago

Does it matter if I change the hashing implementation in a way that affects the output hashes produced by the tool? For our internal system, we cache hash results and need to follow some careful processes for rev-ing the hashing implementation to avoid cache inconsistency. Does that matter here, or am I free to change the code however it is convenient?

I am comfortable with modifying the hashing function, this will be in a breaking change for the repository regardless

For our implementation, we track rule implementation hashes and seed file hashes in the "direct hash" with the rule attrs and srcs. This means that changes to a rule implementation function (or the bzl file's transitive loads) leads to all targets of that rule kind to be directly changed (target_distance=0). The same goes for seed file changes. This has generally worked out okay for us.

This seems correct to me, if anything its more precise then what we are doing today

This proposal looks fine to me, happy to see it progressing

Tinder / bazel-diff