bazelbuild / continuous-integration

Bazel's Continuous Integration Setup
https://buildkite.com
Apache License 2.0

Multiple downstream failures due to `--incompatible_use_platforms_repo_for_constraints` #1404

Open Wyverald opened 2 years ago

Wyverald commented 2 years ago

Failures: https://buildkite.com/bazel/bazel-at-head-plus-downstream/builds/2586#0182a995-9a6f-49fc-a38d-06487e43e242

Flag flip commit: https://github.com/bazelbuild/bazel/commit/3469784a4935c91b4ac22bcfed6be52e6dfec878

Autosheriff link: https://buildkite.com/bazel/bazel-auto-sheriff-face-with-cowboy-hat/builds/1005

Projects to fix:

aiuto commented 2 years ago

Interesting. The configurability team will have to look into these. Even if we were to roll it back, we still want the change for Bazel 6.x, so fixing the rules is the right choice.

On Wed, Aug 17, 2022 at 10:16 AM Xùdōng Yáng wrote:

Failures: https://buildkite.com/bazel/bazel-at-head-plus-downstream/builds/2586#0182a995-9a6f-49fc-a38d-06487e43e242

Flag flip commit: https://github.com/bazelbuild/bazel/commit/3469784a4935c91b4ac22bcfed6be52e6dfec878

Autosheriff link: https://buildkite.com/bazel/bazel-auto-sheriff-face-with-cowboy-hat/builds/1005

Projects to fix:

  • Bazel
  • Bazel integration testing
  • Bazelisk
  • Buildfarm
  • Cartographer
  • Envoy
  • Flatbuffers
  • Kythe
  • Protobuf: sent protocolbuffers/protobuf#10423 https://github.com/protocolbuffers/protobuf/pull/10423
  • TensorFlow
  • rules_cc
  • rules_foreign_cc
  • rules_go
  • rules_groovy
  • rules_jvm_external
  • rules_jvm_external - examples
  • rules_kotlin
  • rules_nodejs
  • rules_proto
  • rules_python
  • rules_rust
  • rules_sass
  • upb


aranguyen commented 2 years ago

@Wyverald A good number of the remaining failures are caused by the use of an older bazel-toolchains, which is linked to RBE. https://github.com/bazelbuild/rules_proto/pull/134 is an example of removing the deprecated rbe_autoconfig and migrating to rbe_preconfig. In the same PR, I also removed bazel-toolchains because there is no other usage. If there are other usages, then updating it to 5.1.2 and using rbe_preconfig is the correct thing to do. I need to divert my attention to something else for the next few hours, so I just want to share this in case you get to them before me. I will be back. Thanks!
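For anyone checking other repositories for the same pattern, here is a minimal sketch of a scan for leftover rbe_autoconfig references in a WORKSPACE file. The function name `find_rbe_autoconfig` and the sample text are hypothetical and only for illustration; the comment stripping is naive and would be fooled by a `#` inside a string.

```python
import re

# Matches rbe_autoconfig as a whole identifier, e.g. inside a load()
# statement or as a call in a WORKSPACE file.
AUTOCONFIG_RE = re.compile(r"\brbe_autoconfig\b")


def find_rbe_autoconfig(workspace_text):
    """Return 1-based line numbers that mention rbe_autoconfig outside comments."""
    hits = []
    for number, line in enumerate(workspace_text.splitlines(), start=1):
        # Naive comment stripping: everything after the first '#' is dropped.
        code = line.split("#", 1)[0]
        if AUTOCONFIG_RE.search(code):
            hits.append(number)
    return hits


if __name__ == "__main__":
    # Illustrative WORKSPACE fragment, not copied from any real project.
    sample = (
        'load("@bazel_toolchains//rules:rbe_repo.bzl", "rbe_autoconfig")\n'
        "# rbe_autoconfig is deprecated\n"
        'rbe_autoconfig(name = "rbe_default")\n'
    )
    print(find_rbe_autoconfig(sample))
```

Any line it reports is a candidate for the rbe_preconfig migration described above; the scan only finds references and does not rewrite anything.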

Wyverald commented 2 years ago

thanks for the update!

aranguyen commented 2 years ago

Just another update: I've sent prs for the rest except for rules_kotlin, rules_nodejs, rules_cc and TensorFlow. @Wyverald is there anything on this list that you're actively working on?

Wyverald commented 2 years ago

Thanks, Ara. I've been busy with other stuff so haven't been sending new PRs for anything else.

For the other PRs you sent, could you mention this issue in them so I can keep track?

aranguyen commented 2 years ago

Okay, sounds good. I will make some time today to work on the rest then. I sent these:

  1. rules_go https://github.com/bazelbuild/rules_go/pull/3272
  2. rules_foreign_cc https://github.com/bazelbuild/rules_foreign_cc/pull/952
  3. rules_rust https://github.com/bazelbuild/rules_rust/pull/1524
  4. rules_proto https://github.com/bazelbuild/rules_proto/pull/134
  5. rules_groovy https://github.com/bazelbuild/rules_groovy/pull/62

The above were all merged. The following are still open:

  1. rules_jvm/rules_jvm_external https://github.com/bazelbuild/rules_jvm_external/pull/730
  2. rules_sass https://github.com/bazelbuild/rules_sass/pull/143
  3. rules_nodejs https://github.com/bazelbuild/rules_nodejs/pull/3541

alexeagle commented 2 years ago

@aranguyen the commit in rules_nodejs didn't actually enable the flag https://github.com/bazelbuild/rules_nodejs/pull/2536/files#diff-544556920c45b42cbfe40159b082ce8af6bd929e492d076769226265f215832fR56

I'll try again now...

aranguyen commented 2 years ago

@alexeagle that comment I made on your pr previously is a bit ancient. I think it was during the time when I updated the external deps for bazel so that is obsolete. The failure for rules_nodejs we're seeing now should be addressed by this pr https://github.com/bazelbuild/rules_nodejs/pull/3541 . Similar reason mentioned here https://github.com/bazelbuild/continuous-integration/issues/1404#issuecomment-1219473068

aranguyen commented 2 years ago

Update:

1) rules_proto: I sent an additional pr https://github.com/bazelbuild/rules_proto/pull/137

2) rules_kotlin has these remaining references

aranguyen-macbookpro:rules_kotlin aranguyen$ grep -r "@bazel_tools//platforms" /private/var/tmp/_bazel_aranguyen/296329cf59a3bbb8fa169f1cac45cd9a
/private/var/tmp/_bazel_aranguyen/296329cf59a3bbb8fa169f1cac45cd9a/external/rules_jvm_external/examples/android_instrumentation_test/BUILD:        "@bazel_tools//platforms:x86_64",
/private/var/tmp/_bazel_aranguyen/296329cf59a3bbb8fa169f1cac45cd9a/external/rules_jvm_external/examples/android_instrumentation_test/BUILD:        "@bazel_tools//platforms:linux",

I need to wait for this PR in rules_jvm_external to be reviewed and merged first before I can update rules_kotlin.

3) rules_cc: I am not able to build it on my machine for another reason, so I filed this issue instead: https://github.com/bazelbuild/rules_cc/issues/140 . It has my analysis and what needs to be done. I will try again tomorrow. Hopefully I receive some guidance on the build issue and I can help with the update as well.

I also saw an internal cl for Tensorflow so at this point all failed downstream deps have been looked at and actions were taken. Please let me know if I'm missing anything.
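The kind of fix needed for the grep hits above can be sketched as a small label rewrite. This is an illustration only: the mapping from `@bazel_tools//platforms:<name>` to `@platforms//os:<name>` or `@platforms//cpu:<name>` reflects my understanding of the `platforms` repository layout, the name sets below are deliberately incomplete, and the helper names `suggest_replacement` and `rewrite_line` are hypothetical.

```python
import re

# Assumed (incomplete) sets of constraint values that moved to @platforms.
OS_NAMES = {"linux", "osx", "windows"}
CPU_NAMES = {"x86_64", "x86_32", "arm", "aarch64", "ppc", "s390x"}

LABEL_RE = re.compile(r"@bazel_tools//platforms:([A-Za-z0-9_]+)")


def suggest_replacement(label):
    """Return the suggested @platforms label, or None if the name is unknown."""
    match = LABEL_RE.fullmatch(label)
    if match is None:
        return None
    name = match.group(1)
    if name in OS_NAMES:
        return "@platforms//os:" + name
    if name in CPU_NAMES:
        return "@platforms//cpu:" + name
    return None


def rewrite_line(line):
    """Rewrite every known old-style label in a line; leave unknown ones alone."""
    def _sub(match):
        replacement = suggest_replacement(match.group(0))
        return replacement if replacement is not None else match.group(0)
    return LABEL_RE.sub(_sub, line)


if __name__ == "__main__":
    # One of the lines reported by the grep above.
    print(rewrite_line('        "@bazel_tools//platforms:x86_64",'))
```

Unknown names are left untouched on purpose, so anything the sketch does not recognize still shows up in a follow-up grep and can be handled by hand.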

mai93 commented 2 years ago

I updated the bazel_skylib version in rules_jvm_external https://github.com/bazelbuild/rules_jvm_external/pull/742 but some of the tests are still failing with this error

(02:23:26) ERROR: /var/lib/buildkite-agent/.cache/bazel/_bazel_buildkite-agent/627400a4322f538dd0c1a564239629b9/external/bazel_tools/third_party/ijar/BUILD:47:11: error loading package '@bazel_tools//third_party/zlib': Unable to find package for @rules_license//rules:license.bzl: The repository '@rules_license' could not be resolved: Repository '@rules_license' is not defined. and referenced by '@bazel_tools//third_party/ijar:zlib_client'

Does anyone have an idea what needs to be updated to fix it?

Wyverald commented 2 years ago

I have a pending CL for that.

Wyverald commented 2 years ago

As of today, the following projects remain broken due to this issue (downstream pipeline link):

The other breakages in downstream are unrelated and need to be fixed separately.

aranguyen commented 2 years ago

@Wyverald thanks for letting me know. I'll take a look at rules_cc, jvm_external, and rules_kotlin again to see why the prs I sent out were not sufficient. For rules_sass, we have a pending pr that has been sitting there unreviewed for a long time: https://github.com/bazelbuild/rules_sass/pull/143

Is it possible to do a clean --expunge beforehand when running the downstream pipeline? I remember there was a discussion about how rbe_preconfig is fetching something in a non-deterministic way?

Wyverald commented 2 years ago

I believe we effectively already do clean --expunge (we start a new docker instance for each job). rbe_preconfig is special in that it fetches from a URL whose contents change; I don't remember how it affected CI.

Wyverald commented 2 years ago

Worth noting, rules_go (on all platforms except RBE) also seems to be broken due to this change, even though the error message doesn't look the same: https://buildkite.com/bazel/bazel-at-head-plus-downstream/builds/2618#01831d43-1066-44ad-9289-f8b72b54f294

In a downstream run with the flag unflipped, rules_go is fine: https://buildkite.com/bazel/bazel-at-head-plus-downstream/builds/2620#01831d5e-da23-40c7-84fe-9b9d5a6244ee

meteorcloudy commented 2 years ago

@aranguyen Given that this flag flip has kept our downstream pipeline red for a long time, do you mind if we first roll back the flag flip? You can still monitor the status of the downstream projects in https://buildkite.com/bazel/bazelisk-plus-incompatible-flags after the flag is unflipped. This will help us keep downstream green and catch actual Bazel bugs.

aranguyen commented 2 years ago

@meteorcloudy we actually want the flag flip for the 6.0 release. If we roll back the flag flip, what is the latest we could roll the change forward again for 6.0?

Wyverald commented 2 years ago

flatbuffers: upgrading grpc to 1.48.0 wasn't enough; we need a version of grpc that contains https://github.com/grpc/grpc/commit/27509c345cb8a4895bf3a93cb4014d012d4a0225, and as of today there's no stable release containing it yet

aranguyen commented 2 years ago

I submitted an internal cl to flip the default value of --incompatible_use_platforms_repo_for_constraints to false for now. Two more prs were sent out: