bazel-contrib / target-determinator

Determines which Bazel targets were affected between two git commits.
Apache License 2.0
144 stars 22 forks source link

Possible circular configurations? #72

Closed purkhusid closed 1 year ago

purkhusid commented 1 year ago

I'm currently debugging an issue I'm having with target-determinator and it's seems like there is a cycle in the graphq that target determinator is processing. This sounds like it could be a possible bug in Bazel or target-determinator.

I'm working on a small repro but it's hard to figure out what is causing this. It seems like it's either caused by the heavy use of transitions in the rules I'm using or heavy use of aliases but I haven't been able to pin it down. The issue that this cycle is causing is that target-determinator freezes because there becomes a race condition for the lock in the hash cache for the label that has the cycle.

Just wanted to check if there are any known issue with cycles while I try to narrow this down.

illicitonion commented 1 year ago

Howdy! There is indeed a known issue! See https://github.com/bazel-contrib/target-determinator/issues/57 for more information.

The general context is that in Bazel 6 and before, cquery output unfortunately doesn't expose what configuration a dependency is needed in, so we do this horrible thing to just add a dependency on it in all configurations:

https://github.com/bazel-contrib/target-determinator/blob/9035fc670cdfc909304b208205fb7af0666f1bb0/pkg/hash_cache.go#L536-L539

But that's not correct, and can potentially lead to this kind of cycle.

In #58 we put in a hacky hacky hack to avoid this specifically for alias targets because we know they don't transition (and we know rules_go had an issue around aliases here):

https://github.com/bazel-contrib/target-determinator/blob/9035fc670cdfc909304b208205fb7af0666f1bb0/pkg/hash_cache.go#L518-L535

We added support to Bazel to know what configurations dependencies are needed in https://github.com/bazelbuild/bazel/pull/15038 but unfortunately ran into some bugs with it (https://github.com/bazelbuild/bazel/issues/17916). Fortunately, that has been fixed in https://github.com/bazelbuild/bazel/commit/9575c7fa86458717c154f08ff8a7f7aff000f6ae.

If you want to test out the improved code which doesn't do the configuration expansion in a verion of bazel which has these fixes, build or obtain a bazel which includes https://github.com/bazelbuild/bazel/commit/9575c7fa86458717c154f08ff8a7f7aff000f6ae and change https://github.com/bazel-contrib/target-determinator/blob/9035fc670cdfc909304b208205fb7af0666f1bb0/pkg/hash_cache.go#L40-L43 to return true.

Realistically, if the configuration-aware-dependencies code works, I'd strongly favour our plan being to just rely on it as soon as we can (and put in a hack or two in the short term if we need to), rather than putting any complex code in place supporting the configuration over-expansion logic.

purkhusid commented 1 year ago

Thanks for the context. I tried using 7.0.0-pre.20230816.3 which should include the fix but now I get an error:

failed to process change: failed to calculate hashes at revision 'before' (sha: 8f0056b38e02a1491ff63eba887d9c7ef4eee571): failed to hash configuredRuleInput //src/dotnet/libraries/FSharp.GrpcCodeGenerator/FSharp.GrpcCodeGenerator {} which is a dependency of //src/dotnet/libraries/FSharp.GrpcCodeGenerator/FSharp.GrpcCodeGenerator:publish_wrapped 0c912defb800eafdffa835bc516bd2956b69827b9c3b1b5bc450344468ddae74: label //src/dotnet/libraries/FSharp.GrpcCodeGenerator/FSharp.GrpcCodeGenerator configuration {} not found in contxt: label not found in context

Not sure if that indicates that there might be something wrong with how my rules use transitions or if the patch for the missing configurations still has some holes in it. Have you seen similar errors while using the configuration aware logic?

illicitonion commented 1 year ago

Can you try applying:

diff --git a/pkg/hash_cache.go b/pkg/hash_cache.go
index e575642..3df1c5f 100644
--- a/pkg/hash_cache.go
+++ b/pkg/hash_cache.go
@@ -492,6 +504,8 @@ func hashRule(thc *TargetHashCache, rule *build.Rule, configuration *analysis.Co
        hasher.Write(protoBytes)
    }

+   ownConfiguration := NormalizeConfiguration(configuration.GetChecksum())
+
    // Hash rule inputs
    if thc.bazelVersionSupportsConfiguredRuleInputs {
        for _, configuredRuleInput := range rule.ConfiguredRuleInput {
@@ -500,6 +514,13 @@ func hashRule(thc *TargetHashCache, rule *build.Rule, configuration *analysis.Co
                return nil, fmt.Errorf("failed to parse configuredRuleInput label %s: %w", configuredRuleInput.GetLabel(), err)
            }
            ruleInputConfiguration := NormalizeConfiguration(configuredRuleInput.GetConfigurationChecksum())
+           if ruleInputConfiguration.String() == "" {
+               if _, ok := thc.context[ruleInputLabel][ownConfiguration]; ok {
+                   ruleInputConfiguration = ownConfiguration
+               } else if _, ok := thc.context[ruleInputLabel][ruleInputConfiguration]; !ok {
+                   return nil, fmt.Errorf("configuredRuleInputs for %s included %s in configuration %s but it couldn't be found either unconfigured or in the depending target's configuration %s", rule.GetName(), ruleInputLabel, ruleInputConfiguration, ownConfiguration)
+               }
+           }
            ruleInputHash, err := thc.Hash(LabelAndConfiguration{Label: ruleInputLabel, Configuration: ruleInputConfiguration})
            if err != nil {
                return nil, fmt.Errorf("failed to hash configuredRuleInput %s %s which is a dependency of %s %s: %w", ruleInputLabel, ruleInputConfiguration, rule.GetName(), configuration.GetChecksum(), err)
purkhusid commented 1 year ago

It looks like this works. At least no errors anymore.

purkhusid commented 1 year ago

Do you want me to create a PR with this fix or do you want to do so yourself?

illicitonion commented 1 year ago

I'll put one together shortly :) Thanks so much!