aspect-build / rules_js

High-performance Bazel rules for running Node.js tools and building JavaScript projects
https://docs.aspect.build/rules/aspect_rules_js
Apache License 2.0

[Bug]: js_binary launcher script not portable from host to exec platform #1168

Open alexeagle opened 1 year ago

alexeagle commented 1 year ago

What happened?

When RBE is used for cross-compilation, the host platform may be different from the exec platform.

In my case I have a linux_x86 host platform, so the launcher written by ctx.actions.expand_template here https://github.com/aspect-build/rules_js/blob/92a36f314b7841475e12a68dcd018c088f373bc2/js/private/js_binary.bzl#L481-L486 contains a node path pointing to the host-resolved toolchain, which is built for the linux_x86 architecture.

Now I enable RBE, and the exec platform is linux_arm64. The launcher script is copied to the remote executor and tries to spawn node for the wrong architecture, which fails with an executable format error: `cannot execute binary file ...nodejs_linux_amd64...`

Version

Bazel 6.2.1, latest of rules_js

How to reproduce

Tricky since you need an RBE setup with alternate architecture.

Any other information?

No response

alexeagle commented 1 year ago

Studied this with @gregmagolan

Let's look at the output of bazel aquery //some:build_smoke_test --config=rbe --config=aarch64 where those config flags enable the cross-platform RBE behavior:

action 'Expanding template some/build_smoke_test.sh'
  Mnemonic: TemplateExpand
  Configuration: k8-fastbuild-aarch64
  Execution platform: //tools/platforms:linux_x86_jetpack5
...
  Substitutions: [
    {{{node}}: my_workspace/../nodejs_linux_amd64/bin/nodejs/bin/node}

...

runfiles for //some:build_smoke_test
  Mnemonic: Middleman
  Target: //some/aerial/frontend:build_smoke_test
  Configuration: k8-fastbuild-aarch64
  Execution platform: //tools/platforms:linux_x86_jetpack5
  ActionKey: 709e80c88487a2411e1ee4dfb9f22a861492d20c4765150c0c794abd70f8147c
  Inputs: [..., external/nodejs_linux_amd64/bin/nodejs/bin/node]

action 'Testing //some:build_smoke_test'
  Mnemonic: TestRunner
  Target: //some:build_smoke_test
  Configuration: k8-fastbuild-aarch64
  Execution platform: //tools/platforms:linux_aarch64_jetpack5
  Command Line: (exec external/bazel_tools/tools/test/test-setup.sh \

What we see here is that even if we fixed the {{node}} template variable we put in the launcher, we would still have the wrong nodejs executable in the runfiles for the test, because the "middleman" action which generates the runfiles has an x86 exec platform. That makes this seem like a Bazel limitation with cross-platform RBE.
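To make the first half of that concrete, here is a minimal Python sketch of what a literal template expansion does with the substitution reported by aquery. The `expand_template` function and the one-line launcher template are assumptions for illustration; they are not the actual rules_js launcher or Bazel's implementation. Only the `{{node}}` key and the amd64 path are taken from the aquery output.

```python
def expand_template(template: str, substitutions: dict[str, str]) -> str:
    """Replace every occurrence of each key with its value, literally."""
    for key, value in substitutions.items():
        template = template.replace(key, value)
    return template


# Hypothetical one-line launcher template (an assumption, not the rules_js one).
LAUNCHER_TEMPLATE = 'exec "{{node}}" "$@"\n'

# Substitution as reported by aquery for the aarch64 configuration.
substitutions = {
    "{{node}}": "my_workspace/../nodejs_linux_amd64/bin/nodejs/bin/node",
}

launcher = expand_template(LAUNCHER_TEMPLATE, substitutions)
print(launcher, end="")
```

The path is fixed at the moment the substitutions are computed, so the generated script carries the amd64 path with it regardless of which platform later executes it.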

alexeagle commented 1 year ago

I think it's just a general problem that cross-platform RBE doesn't work with platform-specific inputs that come from runfiles.

fmeum commented 1 year ago

I don't think the execution platform for the middleman action matters: copying runfiles into the final location is a completely platform-independent action. At first glance this looks like https://bazelbuild.slack.com/archives/CA31HN1T3/p1690184400360329?thread_ts=1690176577.746239&cid=CA31HN1T3: You may need to define an additional toolchain that matches on the target platform, not the exec platform.

alexeagle commented 1 year ago

How is the target platform relevant here? This is a script used in a build action.

fmeum commented 1 year ago

It's a script that references a binary obtained from the toolchain, both by substituting in its path and adding its files to runfiles. But as far as I can tell, there are always two Node toolchains of the same type, one with a target constraint and one with an exec constraint: https://github.com/bazelbuild/rules_nodejs/blob/cc742d3b02c95eb56fce241c8fff6605d9e9c315/nodejs/private/toolchains_repo.bzl#L105-L116

This layout can cause the problem: if the exec platform is linux_arm64 and a js_binary is built in the exec configuration (that is, for linux_arm64), the linux_amd64 toolchain with the exec constraint for linux_amd64 can still end up being selected.

This could be solved by having a second, distinct toolchain type for Node runtimes with target constraints, similar to what the native Java toolchains do.
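
As a rough illustration of that argument, the following Python sketch models toolchain selection as "first registered toolchain of the requested type whose exec and target constraints are both satisfied". This is a simplified model, not Bazel's actual resolution algorithm: it assumes the exec platform has already been chosen, that the amd64 toolchains are registered before the arm64 ones, and it uses made-up toolchain type names (`node_toolchain_type`, `node_runtime_toolchain_type`). The toolchain names follow the pair-per-platform layout linked above.

```python
from dataclasses import dataclass
from typing import Optional

# Platforms and constraints are modeled as sets of constraint values.
LINUX_AMD64 = frozenset({"linux", "x86_64"})
LINUX_ARM64 = frozenset({"linux", "aarch64"})

# Illustrative toolchain type names (assumptions, not real Bazel labels).
NODE_TYPE = "node_toolchain_type"
NODE_RUNTIME_TYPE = "node_runtime_toolchain_type"


@dataclass(frozen=True)
class Toolchain:
    name: str
    toolchain_type: str
    exec_compatible_with: frozenset = frozenset()
    target_compatible_with: frozenset = frozenset()


def resolve(registered, toolchain_type, exec_platform, target_platform) -> Optional[Toolchain]:
    """Return the first toolchain of the given type whose exec and target
    constraints are subsets of the respective platform (simplified model)."""
    for tc in registered:
        if tc.toolchain_type != toolchain_type:
            continue
        if (tc.exec_compatible_with <= exec_platform
                and tc.target_compatible_with <= target_platform):
            return tc
    return None


# Assumed registration order: amd64 before arm64.
PLATFORMS = [("linux_amd64", LINUX_AMD64), ("linux_arm64", LINUX_ARM64)]

# Current layout: two toolchains per platform sharing one toolchain type.
shared_type = []
for name, constraints in PLATFORMS:
    shared_type.append(Toolchain(f"{name}_toolchain", NODE_TYPE,
                                 exec_compatible_with=constraints))
    shared_type.append(Toolchain(f"{name}_toolchain_target", NODE_TYPE,
                                 target_compatible_with=constraints))

# Proposed layout: runtime toolchains get a distinct type with target constraints only.
split_type = []
for name, constraints in PLATFORMS:
    split_type.append(Toolchain(f"{name}_toolchain", NODE_TYPE,
                                exec_compatible_with=constraints))
    split_type.append(Toolchain(f"{name}_toolchain_target", NODE_RUNTIME_TYPE,
                                target_compatible_with=constraints))

# Scenario: actions execute on x86 while the binary is built for arm64.
before = resolve(shared_type, NODE_TYPE,
                 exec_platform=LINUX_AMD64, target_platform=LINUX_ARM64)
after = resolve(split_type, NODE_RUNTIME_TYPE,
                exec_platform=LINUX_AMD64, target_platform=LINUX_ARM64)

print(before.name)
print(after.name)
```

In this model the shared type lets the exec-constrained amd64 toolchain win because it has no target constraint to fail, whereas a rule that asks for the separate runtime type can only match a toolchain whose target constraint fits the arm64 platform. With the opposite registration order the shared layout would happen to pick the arm64 toolchain, which is consistent with the wording that the wrong one "can" be selected.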