pantsbuild / pants

The Pants Build System
https://www.pantsbuild.org
Apache License 2.0
3.31k stars 636 forks source link

user-supplied PATH for shell_command overwritten by binary shims path #21136

Closed tdyas closed 1 month ago

tdyas commented 3 months ago

A user testing the in-workspace execution feature (experimental_workspace_environment) reported that the edits to PATH made by the binary shims support (i.e., the tools field on shell_command) interacted poorly with executing Bazel via in-workspace execution. Namely, the edits made to PATH by Pants caused PATH to vary for each Bazel invocation. Given that Bazel incorporates PATH into its cache key, this means that each invocation of Bazel had to rebuild the Bazel target which was being built.

I researched the issue a bit and learned that shell_command was ignoring any user-supplied PATH value and just using the binary shim path.

The user would like to override this behavior and have PATH passed to the process invoked by shell_command / adhoc_tool as it is explicitly configured and without any binary shims modification or other edits.

tdyas commented 3 months ago

Relevant code: https://github.com/pantsbuild/pants/blob/31bd8c15eb2d048b9ec587a13d88b54ebb755ef0/src/python/pants/core/util_rules/adhoc_process_support.py#L202 plus the ExtraSandboxContents, MergeExtraSandboxContents, and AddExtraSandboxContentsToProcess types.

tdyas commented 3 months ago

For shell_command, the relevant code is https://github.com/pantsbuild/pants/blob/fcac40f0da3c1e8586e5f65cf9859f259e8bbaf5/src/python/pants/backend/shell/util_rules/shell_command.py#L148. I verified with the following change that the default case overwrites the PATH completely:

diff --git a/src/python/pants/backend/adhoc/target_types.py b/src/python/pants/backend/adhoc/target_types.py
index 97283dc19c..65080e26d4 100644
--- a/src/python/pants/backend/adhoc/target_types.py
+++ b/src/python/pants/backend/adhoc/target_types.py
@@ -274,6 +274,20 @@ class AdhocToolWorkspaceInvalidationSourcesField(StringSequenceField):
     )

+class AdhocToolIncludeShimsOnPathField(BoolField):
+    alias: ClassVar[str] = "include_shims_on_path"
+    default = True
+    help = help_text(
+        """
+        If True, then modify the `PATH` of the invoked process to include binary shims for the
+        `tools` configured for the target and also any runnable dependencies.
+
+        Set this to False when using in-workspace execution if you want to preserve the
+        configured `PATH` value.
+        """
+    )
+
+
 class AdhocToolTarget(Target):
     alias: ClassVar[str] = "adhoc_tool"
     core_fields = (
@@ -294,6 +308,7 @@ class AdhocToolTarget(Target):
         AdhocToolStdoutFilenameField,
         AdhocToolStderrFilenameField,
         AdhocToolWorkspaceInvalidationSourcesField,
+        AdhocToolIncludeShimsOnPathField,
         EnvironmentField,
     )
     help = help_text(
diff --git a/src/python/pants/backend/shell/target_types.py b/src/python/pants/backend/shell/target_types.py
index 2705efdf37..d12ba4ecce 100644
--- a/src/python/pants/backend/shell/target_types.py
+++ b/src/python/pants/backend/shell/target_types.py
@@ -10,6 +10,7 @@ from pants.backend.adhoc.target_types import (
     AdhocToolDependenciesField,
     AdhocToolExecutionDependenciesField,
     AdhocToolExtraEnvVarsField,
+    AdhocToolIncludeShimsOnPathField,
     AdhocToolLogOutputField,
     AdhocToolNamedCachesField,
     AdhocToolOutputDependenciesField,
@@ -384,6 +385,10 @@ class ShellCommandWorkspaceInvalidationSourcesField(AdhocToolWorkspaceInvalidati
     pass

+class ShellCommandIncludeShimsOnPathField(AdhocToolIncludeShimsOnPathField):
+    pass
+
+
 class SkipShellCommandTestsField(BoolField):
     alias = "skip_tests"
     default = False
@@ -409,6 +414,7 @@ class ShellCommandTarget(Target):
         ShellCommandNamedCachesField,
         ShellCommandOutputRootDirField,
         ShellCommandWorkspaceInvalidationSourcesField,
+        ShellCommandIncludeShimsOnPathField,
         EnvironmentField,
     )
     help = help_text(
diff --git a/src/python/pants/backend/shell/util_rules/shell_command.py b/src/python/pants/backend/shell/util_rules/shell_command.py
index c4b8cf76ed..45f2bc429d 100644
--- a/src/python/pants/backend/shell/util_rules/shell_command.py
+++ b/src/python/pants/backend/shell/util_rules/shell_command.py
@@ -14,6 +14,7 @@ from pants.backend.shell.target_types import (
     ShellCommandCommandField,
     ShellCommandExecutionDependenciesField,
     ShellCommandExtraEnvVarsField,
+    ShellCommandIncludeShimsOnPathField,
     ShellCommandLogOutputField,
     ShellCommandNamedCachesField,
     ShellCommandOutputDirectoriesField,
@@ -114,33 +115,34 @@ async def _prepare_process_request_from_target(
         ),
     )

-    runnable_dependencies = execution_environment.runnable_dependencies
-    extra_sandbox_contents = []
-
-    extra_sandbox_contents.append(
-        ExtraSandboxContents(
-            EMPTY_DIGEST,
-            resolved_tools.path_component,
-            FrozenDict(resolved_tools.immutable_input_digests or {}),
-            FrozenDict(),
-            FrozenDict(),
-        )
-    )
-
-    if runnable_dependencies:
+    extra_sandbox_contents: list[ExtraSandboxContents] = []
+    if shell_command.get(ShellCommandIncludeShimsOnPathField).value:
         extra_sandbox_contents.append(
             ExtraSandboxContents(
                 EMPTY_DIGEST,
-                f"{{chroot}}/{runnable_dependencies.path_component}",
-                runnable_dependencies.immutable_input_digests,
-                runnable_dependencies.append_only_caches,
-                runnable_dependencies.extra_env,
+                resolved_tools.path_component,
+                FrozenDict(resolved_tools.immutable_input_digests or {}),
+                FrozenDict(),
+                FrozenDict(),
             )
         )

+        runnable_dependencies = execution_environment.runnable_dependencies
+        if runnable_dependencies:
+            extra_sandbox_contents.append(
+                ExtraSandboxContents(
+                    EMPTY_DIGEST,
+                    f"{{chroot}}/{runnable_dependencies.path_component}",
+                    runnable_dependencies.immutable_input_digests,
+                    runnable_dependencies.append_only_caches,
+                    runnable_dependencies.extra_env,
+                )
+            )
+
     merged_extras = await Get(
         ExtraSandboxContents, MergeExtraSandboxContents(tuple(extra_sandbox_contents))
     )
+    
     extra_env = dict(merged_extras.extra_env)
     if merged_extras.path:
         extra_env["PATH"] = merged_extras.path
diff --git a/src/python/pants/backend/shell/util_rules/shell_command_test.py b/src/python/pants/backend/shell/util_rules/shell_command_test.py
index beda00d131..4faa97e651 100644
--- a/src/python/pants/backend/shell/util_rules/shell_command_test.py
+++ b/src/python/pants/backend/shell/util_rules/shell_command_test.py
@@ -911,3 +911,40 @@ def test_shell_command_workspace_invalidation_sources(rule_runner: RuleRunner) -
     time.sleep(0.100)
     result3 = execute_shell_command(rule_runner, address)
     assert result1.snapshot != result3.snapshot
+
+
+def test_shell_command_include_shims_on_path(rule_runner: RuleRunner) -> None:
+    rule_runner.write_files({
+        "src/BUILD": dedent(
+            """\
+            shell_command(
+                name="shims_true",
+                tools=["echo"],
+                command="echo $PATH > path.txt",
+                extra_env_vars=["PATH=/bin:/usr/bin"],
+                output_files=["path.txt"],
+                include_shims_on_path=True,
+            )
+            shell_command(
+                name="shims_false",
+                tools=["echo"],
+                command="echo $PATH > path.txt",
+                extra_env_vars=["PATH=/bin:/usr/bin"],
+                output_files=["path.txt"],
+                include_shims_on_path=False,
+            )
+            """
+        )
+    })
+
+    def run(target_name: str) -> str:
+        result = execute_shell_command(rule_runner, Address("src", target_name=target_name))
+        contents = rule_runner.request(DigestContents, [result.snapshot.digest])
+        assert len(contents) == 1
+        return contents[0].content.decode().strip()
+
+    path_true = run("shims_true")
+    assert "/bin:/usr/bin" not in path_true
+
+    path_false = run("shims_false")
+    assert path_false == "/bin:/usr/bin"
tdyas commented 3 months ago

https://github.com/pantsbuild/pants/pull/21138 is a proof of concept fix which introduces a path_shims_mode field on shell_command which controls how the binary shims path is combined with an existing PATH.

tdyas commented 1 month ago

Closing since this issue was solved by the addition in https://github.com/pantsbuild/pants/pull/21138 of the path_env_modify field to adhoc_tool and shell_command target types.