Does this issue reproduce with the latest releases of all the above?
Untested
What operating system and processor architecture are you using?
Ubuntu 22.04 x86_64
Any other potentially useful information about your toolchain?
Remote execution
What did you do?
We have a remote execution setup with two execution platforms (registered with --host_platform and --extra_execution_platforms). A Go toolchain is registered and compatible with both execution platforms, but only the second execution platform has a compatible cpp toolchain registered.
Execution platform | Available toolchains
-------------------+-------------------------------------------------------------------------
1. A               | @io_bazel_rules_go//go:toolchain
2. B               | @io_bazel_rules_go//go:toolchain, @bazel_tools//tools/cpp:toolchain_type
What we observed is that rules_go tries to use the cpp toolchain, which is only compatible with the second execution platform, in an action running on the first one. The action fails because the cpp toolchain isn't installed on that platform.
Note that the execution platform ordering matters: we get the error because Bazel prefers the first platform whenever it considers it compatible with the action.
What did you expect to see?
Successful build
What did you see instead?
ERROR: /var/lib/blah/bazel/a8584ebfb3d6ff0dfe61abfbfa5bb4d3/external/io_bazel_rules_go/BUILD.bazel:42:7: GoStdlib external/io_bazel_rules_go/stdlib_/pkg failed: (Exit 1): builder failed: error executing GoStdlib command (from target @@io_bazel_rules_go//:stdlib)
...
cgo: C compiler "/usr/bin/clang-13" not found: exec: "/usr/bin/clang-13": stat /usr/bin/clang-13: no such file or directory
Discussion
The underlying cause is that the stdlib target depends on the cgo_context_data target, and cgo_context_data depends on the cpp toolchain, so its execution platform is constrained to the platforms compatible with the selected toolchain. However, instead of executing the compiler itself, cgo_context_data only returns the path to it in its provider.
The rule that actually executes the compiler is stdlib, but it has no dependency on the cpp toolchain, so Bazel doesn't know that it has to run on a platform compatible with the cpp toolchain. Bazel therefore defaults to the first platform, and the action fails because /usr/bin/clang-13 doesn't exist there.
The patch to rules_go that we have used adds the cpp toolchain dependency to all the rules that depend on cgo_context_data, so their execution platform is also constrained to the platforms compatible with the selected cpp toolchain. (https://github.com/bazelbuild/rules_go/pull/4128)
An even better fix would be to make cgo_context_data a toolchain itself so that it influences the execution platform of the rules that depend on it, but that would be a bigger diff.
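To illustrate the platform selection described in this discussion, here is a minimal Python sketch. It is a simplified model, not Bazel's actual resolution code: it assumes Bazel walks the execution platforms in registration order and picks the first one that provides every toolchain type the rule declares. The names `EXEC_PLATFORMS` and `select_exec_platform` are hypothetical and exist only for this example.

```python
# Simplified model of execution platform selection (assumption: first match wins).
# This is NOT Bazel's real implementation; all names here are hypothetical.

GO = "@io_bazel_rules_go//go:toolchain"
CC = "@bazel_tools//tools/cpp:toolchain_type"

# Execution platforms in registration order, with the toolchain types
# that have a compatible toolchain registered for each platform.
EXEC_PLATFORMS = [
    ("A", {GO}),
    ("B", {GO, CC}),
]


def select_exec_platform(required_toolchain_types):
    """Return the first platform that provides every required toolchain type."""
    required = set(required_toolchain_types)
    for name, available in EXEC_PLATFORMS:
        if required <= available:
            return name
    return None  # no compatible execution platform


# stdlib declaring only the Go toolchain: platform A looks compatible,
# even though the action will later invoke the C compiler.
print("without cpp toolchain dependency:", select_exec_platform([GO]))

# stdlib also declaring the cpp toolchain: platform A is ruled out.
print("with cpp toolchain dependency:", select_exec_platform([GO, CC]))
```

In this model, a rule that declares only the Go toolchain lands on platform A, while one that also declares the cpp toolchain lands on platform B, which matches the behavior we see before and after the patch.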
What version of rules_go are you using?
v0.47
What version of gazelle are you using?
v0.36.0
What version of Bazel are you using?
7.3.0