bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0
23.2k stars 4.06k forks source link

Host agnostic absolute tool path for cross compilation #14569

Closed erenon closed 1 year ago

erenon commented 2 years ago

Description of the problem / feature request:

cc_toolchain_config.tool_paths lets the user to specify paths of non-/hermetic tools to be used. If the specified path is absolute (non-hermetic), is is used as-is, otherwise (hermetic) a prefix is prepended to it before execution. On linux, a path is absolute, if it starts with /, on windows, if it starts with a drive letter.

However, if I want to use remote execution, to launch a compilation from a linux machine, to be executed on a remote windows machine, the tool path is going to be something like C:/Program Files/.../cl.exe, and bazel running on linux will recognise that as a relative path, breaking the remote execution.

Feature requests: what underlying problem are you trying to solve with this feature?

I'd like to be able to start a remote C++ compilation from linux, to be executed on windows. The windows toolchain is non-hermetic.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Will provide one if really needed.

What operating system are you running Bazel on?

Linux, remote executor running on windows.

What's the output of bazel info release?

release master (@71cd4866001d281fa60f294ee8108bf05e24e8af)

Have you found anything relevant by searching the web?

CcToolchainProviderHelper.java:385 calls toolpath.getRelative(prefix), that causes the issue. getRelative considers the path specification of the local host only.

gregestren commented 2 years ago

@meteorcloudy did you have any quick insights or issue references for this re: Windows vs. Linux path differences?

meteorcloudy commented 2 years ago

Yeah, cross-compilation between Unix and Windows (host vs execution) just currently doesn't work in Bazel.

The reason is that we're abusing [OS.getCurrent() != OS.WINDOWS](https://cs.opensource.google/search?q=%22OS.getCurrent()%20!%3D%20OS.WINDOWS%22&ss=bazel%2Fbazel&start=11) in our code. To fix this, we need to figure out for each place if it should use host/execution/target platform as "current platform". This won't be an easy task, because often the host/execution/target platform info isn't easily accessible in the context.

gregestren commented 2 years ago

@meteorcloudy I just re-categorized the team label, probably incorrectly. Do you mind setting that and the P1|2|3 label accurately?

meteorcloudy commented 2 years ago

Sure, this is something we should really consider fixing when we have the capacity.

meteorcloudy commented 2 years ago

BTW, I think this is more related to remote execution?

erenon commented 2 years ago

With a bunch of hacks, I managed to make linux -> windows remote execution work (cc_* rules). I document the kludges for reference, if anyone wants to estimate the effort required.

First, fix windows path handling on linux, basically by copying the getDriveStrLength implementation from windows. This relative breaks linux paths that start with e.g: C:/:

--- a/src/main/java/com/google/devtools/build/lib/vfs/UnixOsPathPolicy.java
+++ b/src/main/java/com/google/devtools/build/lib/vfs/UnixOsPathPolicy.java
@@ -87,6 +87,23 @@ class UnixOsPathPolicy implements OsPathPolicy {
   public int getDriveStrLength(String path) {
-    if (path.length() == 0) {
+    int n = path.length();
+    if (n == 0) {
       return 0;
     }
-    return (path.charAt(0) == '/') ? 1 : 0;
+    char c0 = path.charAt(0);
+    if (c0 == '/') {
+      return 1;
+    }
+    if (n < 3) {
+      return 0;
+    }
+    char c1 = path.charAt(1);
+    char c2 = path.charAt(2);
+    if (isDriveLetter(c0) && c1 == ':' && (c2 == '/' || c2 == '\\')) {
+      return 3;
+    }
+    return 0;
+  }
+
+  private static boolean isDriveLetter(char c) {
+    return ((c >= 'a') && (c <= 'z')) || ((c >= 'A') && (c <= 'Z'));
   }

The remote executor I was using (buildbarn) put the execution root in a dir that wasn't recognized by ShowIncludeFilter: <buildDirectoryPath>/<buildActionHash>/root/external/<repositoryName>/. So had to hack it:

--- a/src/main/java/com/google/devtools/build/lib/rules/cpp/ShowIncludesFilter.java
+++ b/src/main/java/com/google/devtools/build/lib/rules/cpp/ShowIncludesFilter.java
@@ -155,2 +155,6 @@ public class ShowIncludesFilter {

+    // Also grab files under the cross-build windows remote executors base
+    private static final Pattern WIN_EXECROOT_BASE_HEADER_PATTERN =
+        Pattern.compile(".*root\\\\external\\\\(?<headerPath>.*)");
+
     public FilterShowIncludesOutputStream(OutputStream out, String sourceFileName) {
@@ -177,2 +181,9 @@ public class ShowIncludesFilter {
             }
+            else
+            {
+              m = WIN_EXECROOT_BASE_HEADER_PATTERN.matcher(line);
+              if (m.matches()) {
+                line = "external/" + m.group("headerPath");
+              }
+            }
             dependencies.add(line);
@@ -228,3 +239,3 @@ public class ShowIncludesFilter {
       for (String dep : filterShowIncludesOutputStream.getDependencies()) {
-        dependenciesInPath.add(root.getRelative(dep));
+        dependenciesInPath.add(root.getRelative(dep.replace('\\', '/')));
       }

This will make cross compilation work (not testing), if the remote workers are deployed with workspace name differentiators. With a large number of workspaces, that is untenable, so I disabled it. Unfortunately, I do not understand the issue described by the commit that introduces the differentiators: https://github.com/bazelbuild/bazel/commit/b863dbf2d2a327364483c72f1c5e73962af61148

--- a/src/main/java/com/google/devtools/build/lib/rules/cpp/CppCompileAction.java
+++ b/src/main/java/com/google/devtools/build/lib/rules/cpp/CppCompileAction.java
@@ -1562,12 +1562,2 @@ public class CppCompileAction extends AbstractAction implements IncludeScannable

-    if (shouldParseShowIncludes()) {
-      // Hack on Windows. The included headers dumped by cl.exe in stdout contain absolute paths.
-      // When compiling the file from different workspace, the shared cache will cause header
-      // dependency checking to fail. This was initially fixed by a hack (see
-      // https://github.com/bazelbuild/bazel/issues/9172 for more details), but is broken again due
-      // to cl/356735700. We require execution service to ignore caches from other workspace.
-      executionInfo.put(
-          ExecutionRequirements.DIFFERENTIATE_WORKSPACE_CACHE, execRoot.getBaseName());
-    }
-

Now let's see cc_test. TestActionBuilder uses the local platform to determine the execution platform, breaking things. I was looking for a way to get the execution platform, but couldn't find one. Fortunately, the target platform was good enough for me. Deciding whether it is a windows platform or not is also something I couldn't done cleanly, so I'm using a heuristic on the platform name:

--- a/src/main/java/com/google/devtools/build/lib/analysis/BUILD
+++ b/src/main/java/com/google/devtools/build/lib/analysis/BUILD
@@ -334,2 +334,3 @@ java_library(
         ":platform_options",
+        ":platform_configuration",
         ":provider_collection",
--- a/src/main/java/com/google/devtools/build/lib/analysis/test/TestActionBuilder.java
+++ b/src/main/java/com/google/devtools/build/lib/analysis/test/TestActionBuilder.java
@@ -33,2 +33,3 @@ import com.google.devtools.build.lib.analysis.FilesToRunProvider;
 import com.google.devtools.build.lib.analysis.PackageSpecificationProvider;
+import com.google.devtools.build.lib.analysis.PlatformConfiguration;
 import com.google.devtools.build.lib.analysis.PrerequisiteArtifacts;
@@ -180,2 +181,3 @@ public final class TestActionBuilder {
     ArtifactRoot root = ruleContext.getTestLogsDirectory();
+    PlatformConfiguration platformConfiguraion = config.getFragment(PlatformConfiguration.class);

@@ -184,3 +186,4 @@ public final class TestActionBuilder {
     // initialization.
-    final boolean isExecutedOnWindows = OS.getCurrent() == OS.WINDOWS;
+    // let's use the target platform. I can't find the execution platform.
+    final boolean isExecutedOnWindows = platformConfiguraion != null && platformConfiguraion.getTargetPlatform().getName().startsWith("win");
     final boolean isUsingTestWrapperInsteadOfTestSetupScript = isExecutedOnWindows;

Unfortunately, test wrapper programs, required by test execution are only compiled for the host platform. Probably there's a way to make them recompile with remote execution. For this experiment, I just used an ultimate test: Unzipped the required files from a windows bazel build, and added them to the linux bazel soure:

--- a/tools/test/BUILD
+++ b/tools/test/BUILD
@@ -122,3 +122,6 @@ filegroup(
         ],
-        "//conditions:default": [],
+        "//conditions:default": [
+            "tw.exe",
+            "xml.exe",
+        ],
     }),

Finally, linux->windows cc cross compile and test appears to work. There are further issues with sh and py rules.

brentleyjones commented 2 years ago

@EdSchouten FYI about the Buildbarn hack mentioned above.

coeuvre commented 2 years ago

Probably related https://github.com/bazelbuild/bazel/issues/11765#issuecomment-821023996 to get the ShowIncludeFilter work on remote executor.

github-actions[bot] commented 1 year ago

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 14 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (@bazelbuild/triage) if you think this issue is still relevant or you are interested in getting the issue resolved.

tjgq commented 1 year ago

Not stale.

coeuvre commented 1 year ago

Duplicate of https://github.com/bazelbuild/bazel/issues/19208