pantsbuild / pants

The Pants Build System
https://www.pantsbuild.org
Apache License 2.0
3.25k stars 625 forks source link

vcs_version fails when building pex binary in a docker environment #21178

Closed kilogram closed 1 week ago

kilogram commented 1 month ago

Describe the bug

vcs_version targets fail to build when pointed at a docker_environment: ('ERROR: no version found for', Namespace(root='/u/kilogram/src/loose-cannon', config='pyproject.synthetic.toml', strip_dev=False, command=None))

The __run.sh looks the same regardless of the environment:

#!/usr/bin/env bash
# This command line should execute the same process as pants did internally.
cd /tmp/pants-sandbox-AGNZdd
env -i  ./setuptools_scm.pex_pex_shim.sh --root /u/kilogram/src/loose-cannon --config pyproject.synthetic.toml

I'm inferring from this that setuptools itself is not being run in the docker container.

The relevant portions of BUILD look like:

__defaults__({
  pex_binary: dict(environment="docker"),
  docker_image: dict(build_platform=["linux/amd64"]),
})

vcs_version(
    name = "_version",
    generate_to="libs/_version.py",
    template='vcs_version = "{version}"',
)

docker_environment(name="docker", platform="linux_x86_64", ...)
pex_binary(..., environment="docker")

I observed that when the pex_binary's environment is "docker", the sandbox has broken .cache symlink:

$ ls -al /tmp/pants-sandbox-AGNZdd/.cache/pex_root
lrwxrwxrwx 1 kilogram kilogram 28 Jul 16 16:35 /tmp/pants-sandbox-AGNZdd/.cache/pex_root -> /pants-named-caches/pex_root

This of course succeeds when the pex_binary's environment is overriden to a local_environment:

$ ls -al /tmp/pants-sandbox-7uPeuH/.cache/pex_root
lrwxrwxrwx 1 kilogram kilogram 46 Jul 16 16:39 /tmp/pants-sandbox-7uPeuH/.cache/pex_root -> /u/kilogram/.cache/pants/named_caches/pex_root

Pants version Which version of Pants are you using?

2.22.0a0

OS Are you encountering the bug on MacOS, Linux, or both?

Linux edi 6.9.6-arch1-1 #1 SMP PREEMPT_DYNAMIC Fri, 21 Jun 2024 19:49:19 +0000 x86_64 GNU/Linux ldd (GNU libc) 2.39 cp311-cp311-manylinux_2_39_x86_64

Additional info Add any other information about the problem here, such as attachments or links to gists, if relevant.

benjyw commented 1 month ago

I assume this is because the docker container doesn't have the local git state available?

kilogram commented 1 month ago

Yes, I believe that's right, from poking around the docker container yesterday and running the command myself.

Is it possible to run the vcs_version target outside the docker container unconditionally? Or to make the local git state available as a volume?

benjyw commented 1 month ago

I think you're correct that this must always run locally, regardless of environment settings, so this is indeed a bug.

benjyw commented 1 month ago

It is possible to force the process to run locally, I will take a look at doing so.

benjyw commented 1 month ago

@kilogram can you create a simple repo that demonstrates the issue?

Or alternatively, can you patch https://github.com/pantsbuild/pants/pull/21188 into the Pants repo, build Pants, and try it out on your use case? The simplest way to do this is to clone https://github.com/pantsbuild/pants, patch the change in, and run PANTS_SOURCE=path/to/pantsbuild/pants/repo pants ... in your repo. The env var tells pants to build and run itself from the repo at that path.

kilogram commented 1 month ago

Works like a charm!

I tested with this repro: https://github.com/kilogram/pants-21178-repro/blob/8fd6bb393f6c5690f8427dc62f8f344818531b16/pants.toml#L45

benjyw commented 1 month ago

Fixed in https://github.com/pantsbuild/pants/pull/21188

benjyw commented 1 month ago

Thanks for the report @kilogram , this will be in the next release.

kilogram commented 1 month ago

Thank you!

kilogram commented 1 month ago

Argh, I'm sorry, I missed something. The vcs_version indeed runs, but the version it gets is null when built in the docker environment. Can be verified even without starting a docker container with pants run libs:print_version in my repro repo. The behaviour inside the docker container made with pants package is the same.

{apricot} edi [11:31:41] » pants run libs:print_version
11:31:47.71 [WARN] System path set by ASDF configured by `[python-bootstrap].search_path` is unsupported, ignoring. This version will not be considered when determining which Python interpreters to use. Please remove 'system' from `/u/kilogram/.tool-versions` to disable this warning.
11:31:47.79 [INFO] Canceled: Run setuptools_scm for libs:_version
11:31:47.87 [INFO] Completed: Run setuptools_scm for libs:_version
PEX=/tmp/pants-sandbox-pGTkPm/libs/print_version.pex
BinaryVersion(release_tag=None, next_version='0.1', dev_distance='3', revision_hash='b9aa2ed', dirty_workdir='20240725')
{apricot} edi [11:31:49] » ls
BUILD  dist  docker  libs  pants.toml
{apricot} edi [11:31:51] » vim pants.toml
{apricot} edi [11:31:53] » pants run libs:print_version
11:31:59.80 [INFO] Initializing scheduler...
11:31:59.87 [INFO] Initializing Nailgun pool for 40 processes...
11:32:02.37 [INFO] Scheduler initialized.
11:32:02.45 [WARN] The `run` goal was called with target `libs:print_version`, which specifies the environment `linux`, which is a `docker_environment`. The `run` goal only runs in the local environment. You may experience unexpected behavior.
11:32:03.78 [INFO] Completed: Run setuptools_scm for libs:_version
11:32:04.72 [INFO] Completed: Building libs/print_version.pex
PEX=/tmp/pants-sandbox-0BUYRI/libs/print_version.pex
BinaryVersion(release_tag=None, next_version='0.1', dev_distance='3', revision_hash='b9aa2ed', dirty_workdir=None)

I don't see an option to re-open this bug; should I create a new one?

benjyw commented 1 month ago

I've reopened. I will look at the repro.

benjyw commented 1 month ago

@kilogram what is an example of a working case? If I run pants run libs:print_version with linux = "//:local_linux" and with linux = "//:docker" I get release_tag=None in both cases. So what is the expected behavior?

kilogram commented 1 month ago

Sorry about that. I've pushed a change to easily repro. At head in the repro directory:

$ pants run docker/print_version
...
BinaryVersion(release_tag='', next_version=None, dev_distance=None, revision_hash=None, dirty_workdir=None)

Poking around the container, the produced _version.py is empty.

As opposed to:

$ pants run libs:print_version
...
BinaryVersion(release_tag=None, next_version='0.1', dev_distance='4', revision_hash='3c810ef', dirty_workdir=None)
benjyw commented 1 month ago

I can't pull the docker image, as it's hosted on some private repo

benjyw commented 1 month ago

@kilogram I will continue to look at this if you can make the example work without the private repo?

kilogram commented 1 month ago

Updated the repro repo, sorry about that! I am getting a different error now, even with PANTS_SOURCE set exactly to the same commit: VCSVersioningError: Trying to determine the version for the libs:_version target at libs:_version, but you are not running in a git worktree.

benjyw commented 1 month ago

Hmm, it cannot find your .git state. You're definitely running in a git worktree?

kilogram commented 1 month ago

Yup. It works with the local_linux environment:

{apricot} edi [17:20:43] » PANTS_SOURCE=$HOME/src/pants pants run docker/print_version
Pantsd has been turned off via Env.
17:20:47.01 [INFO] Initializing Nailgun pool for 40 processes...
17:20:49.69 [INFO] Initializing Nailgun pool for 40 processes...
17:20:53.92 [INFO] Completed: Run setuptools_scm for libs:_version
17:20:53.99 [INFO] Completed: Building docker image print_version/print_version:0.1-813dcd3feb41cd76cad0e12e3e1264c1cc5d2def371cb6113388daac305165cb
PEX=/app/print_version.pex
BinaryVersion(release_tag=None, next_version='0.1', dev_distance='5', revision_hash='629ebb3', dirty_workdir='20240729')
{apricot} edi [17:20:58] » git status
On branch main
Your branch is up to date with 'origin/main'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
    modified:   pants.toml

no changes added to commit (use "git add" and/or "git commit -a")
{apricot} edi [17:20:59] » git diff
diff --git a/pants.toml b/pants.toml
index 35a19aa..1bcbb8c 100644
--- a/pants.toml
+++ b/pants.toml
@@ -41,7 +41,7 @@ default_repository = "{directory}/{name}"
 extra_env_vars = ["HOME"]

 [environments-preview.names]
-# linux = "//:local_linux"  # <-- Works
-linux = "//:docker"  # <-- breaks
+linux = "//:local_linux"  # <-- Works
+# linux = "//:docker"  # <-- breaks
 local_osx = "//:local_osx"
 docker = "//:docker"

The only change is to revert the local changes, and it fails (with the docker environment):

edi [17:24:33] » git checkout -- pants.toml
edi [17:24:33] » git status
## main...origin/main
edi [17:24:37] » PANTS_SOURCE=$HOME/src/pants pants run docker/print_version
Pantsd has been turned off via Env.
17:24:39.17 [INFO] Initializing Nailgun pool for 40 processes...
17:24:41.79 [INFO] Initializing Nailgun pool for 40 processes...
17:24:45.78 [ERROR] 1 Exception encountered:

Engine traceback:
  in `run` goal

VCSVersioningError: Trying to determine the version for the libs:_version target at libs:_version, but you are not running in a git worktree.
benjyw commented 1 month ago

I'll dig in, I see you've pushed the image source fix

benjyw commented 1 month ago

Hmm, that error happens even if I revert my change...

benjyw commented 1 month ago

Ah, because the process that finds git is still running in the container... 🤦

benjyw commented 1 month ago

FYI the docker image you provided (python:3.9.14-slim-bullseye) isn't compatible with your interpreter constraints (CPython>=3.11,<3.12)

benjyw commented 1 month ago

@kilogram can you patch in https://github.com/pantsbuild/pants/pull/21231 and see if that gives you what you expect?

benjyw commented 1 month ago

Also, incidentally, while debugging this I fixed this: https://github.com/pantsbuild/pants/pull/21232

kilogram commented 1 month ago

Pushed a fix to the repro, but still no joy; getting an empty version string:

edi [10:56:04] » PANTS_SOURCE=$HOME/src/pants pants run docker/print_version
Pantsd has been turned off via Env.
10:56:07.33 [INFO] Initializing Nailgun pool for 40 processes...
10:56:09.88 [INFO] Initializing Nailgun pool for 40 processes...

10:56:14.38 [INFO] Completed: Run setuptools_scm for libs:_version
10:56:14.42 [INFO] Completed: Building docker image print_version/print_version:0.1-ffb8344a84d520c5a10d768ceb2591e032d73cd0692f630a524dcf1932998d47
PEX=/app/print_version.pex
BinaryVersion(release_tag='', next_version=None, dev_distance=None, revision_hash=None, dirty_workdir=None)
benjyw commented 1 month ago

Hmm, that may be a separate issue. Ignoring the docker/print_version part, does run libs:print_version work now? And did it before? I would have expected that to previously fail when run in a docker environment. So let's at least figure out that part, i.e., is a correct _version.py being generated when running with the docker environment now, and was it before?

benjyw commented 1 month ago

Ugh, of course the python interpreter discovery also has to run locally...

kilogram commented 1 month ago

FWIW, PANTS_SOURCE=$HOME/src/pants pants run libs:print_version always returns a version, in any environment (and from upstream pants): BinaryVersion(release_tag=None, next_version='0.1', dev_distance='6', revision_hash='8513445', dirty_workdir='20240730')

kilogram commented 1 month ago

Oh! In the docker environment, pants run libs:print_version prints the version, but pants package libs:print_version and then running the PEX does not work.

They both work as intended in the local environment. Keeping the sandboxes and unzipping the PEXes, the one produced by run has a non-empty _version.py, but the one produced by package has an empty one.

I hope you're not as lost as I am :sweat_smile:

benjyw commented 1 month ago

And all of that is with which set of changes?

kilogram commented 1 month ago

This is with your changes to Run git locally... (0842d689a3).

benjyw commented 4 weeks ago

Update that I am still trying to figure this one out. I have been sidetracked debugging some engine issue that is blocking the fix.

benjyw commented 1 week ago

@kilogram can you patch in #21353 and see if that is satisfactory?

kilogram commented 1 week ago

It all works! Thank you!

benjyw commented 1 week ago

Awesome! This will be cherry-picked to 2.22.x, which will have an RC today and a stable release soon.