pantsbuild / pants

The Pants Build System
https://www.pantsbuild.org
Apache License 2.0
3.32k stars 636 forks source link

Add ability to extend a PEX lockfile without modifying existing entries #15704

Open danxmoran opened 2 years ago

danxmoran commented 2 years ago

Is your feature request related to a problem? Please describe.

When I add a new python_requirement to my repo, I run ./pants generate-lockfiles. In addition to adding a new entry for the requirement, the goal sometimes updates existing entries in my lockfile (i.e. to bump the patch version of a transitive dependency). Either before or during code review I'll typically end up manually reverting all the "unrelated" changes to keep my commit focused - this is easy to forget to do, so accidental upgrades sometimes creep through.

The fact that generate-lockfiles might update existing entries in-place also prevents us from adding a CI check to assert that the lockfiles are up-to-date - ideally we'd run generate-lockfiles and fail if there was any diff, but we can't do that if there's a chance that unrelated changes might cause the check to fail.

Describe the solution you'd like

I'd like a new option on generate-lockfiles that tells the command to only add new entries to the lock, and not update existing ones. If it's not possible to add a new entry without updating an existing one, I'd like the command to fail with a description of the version conflict.

jsirois commented 2 years ago

@danxmoran this amounts to a feature request against Pex.

Today Pex supports:

  1. pex3 lock update my.lock to update a whole lockfile with latest that meet the original requirement constraints.
  2. pex3 lock update my.lock -p just_update_this_project -p oh_and_this_one_too<2.0 to update individual projects already in the lockfile (say you want to work around a CVE).

Pants does not expose these update features. Really it doesn't expose 2 since 1 is effectively just like re-running pex3 lock create ... with the original requirements.

So Pex would need a new update feature:

  1. pex3 lock update my.lock -a add_this_dep_if_possible_without_updating_the_rest_of_the_lock

Today, when you try to do this using the existing update -p feature you get:

$ pex3 lock create requests -o lock
$ pex3 lock update -p "charset-normalizer<2.0.10" lock --dry-run
Would update charset-normalizer from 2.0.12 to 2.0.9 in lock generated by cp311-cp311-manylinux_2_35_x86_64.
$ pex3 lock update -p ansicolors lock --dry-run
The following updates were requested but there were no matching locked requirements found in lock:
+ ansicolors

Once Pex implements 3, then Pants would need to tackle the issue of lockfile updates and take these features (2 & 3) on board.

danxmoran commented 2 years ago

@jsirois thanks for the context! I opened an issue in the PEX repo to track that piece of it.

stuhood commented 2 years ago

The need for lockfile updates as covered above makes sense.

But as to this point:

The fact that generate-lockfiles might update existing entries in-place also prevents us from adding a CI check to assert that the lockfiles are up-to-date - ideally we'd run generate-lockfiles and fail if there was any diff, but we can't do that if there's a chance that unrelated changes might cause the check to fail.

This could/should probably be covered by a ./pants generate-lockfiles --check mode which validates that the inputs for the lockfile are unchanged (in internals terms, that's having the same "lockfile metadata", which is stored in the header of the lockfile). If that is something that you'd be interested in seeing as well, it's almost certainly an easier problem... should probably just be in a different ticket.

Eric-Arellano commented 2 years ago

This could/should probably be covered by a ./pants generate-lockfiles --check mode which validates that the inputs for the lockfile are unchanged

Yeah, I think that is easy for us to do.

danxmoran commented 2 years ago

A dedicated "check" mode would be great! I'll file another ticket for that

jsirois commented 2 years ago

I missed the bit about the check. FWIW, this exists today over in Pex:

Prep a lock:

$ pex3 lock create requests -o lock

Update it right away; so nothing new: no stdout and exit code 0.

$ pex3 lock update lock --dry-run

Prep a simulated old lock

$ pex3 lock create requests==2 -o lock --indent 2
$ cp lock lock-old-simulation
$ vi lock-old-simulation
$ diff -u lock lock-old-simulation 
--- lock    2022-05-31 18:44:41.198081339 -0700
+++ lock-old-simulation 2022-05-31 18:45:10.407617559 -0700
@@ -32,7 +32,7 @@
   "pex_version": "2.1.92",
   "prefer_older_binary": false,
   "requirements": [
-    "requests==2"
+    "requests"
   ],
   "requires_python": [],
   "resolver_version": "pip-legacy-resolver",

Update it: stdout contains updates and exit code 0

$ pex3 lock update lock-old-simulation --dry-run
Would update requests from 2 to 2.27.1 in lock generated by cp311-cp311-manylinux_2_35_x86_64.

So, that gives you --check via --dry-run with the stdout being what is out of date.

Perhaps Pants just running pex2 lock create ... and diffing against the current lock is more what you want, but it's a lot more output and it won't currently work since things like the platform_tag and pex version can change from run-to-run if the Python picked to run the lock changes or Pex has been upgraded even though neither is material to the --check.

jsirois commented 2 years ago

I guess adding an optional 'check' argument to --dry-run; i.e: --dry-run check which ran a --dry-run but exited non-zero if there are updates might make sense.

jsirois commented 2 years ago

The pex3 lock update --dry-run check implementation is here: https://github.com/pantsbuild/pex/pull/1799

taylorm0192 commented 2 years ago

Had some discussion w/ Stu on this issue on Slack (https://pantsbuild.slack.com/archives/C046T6T9U/p1657905057557149).

Settling on the idea of using as open as possible constraints for python_requirements in generating the python-default resolve, we would like to reduce the testing overhead of adding a new requirement (esp. in the early days), so to us this would look like pants generate-lockfiles --pin-already-resolved-requirements --resolve=python-default. That is, attempt to generate the new lockfile, but replace any requirement constraints with pinned constraints from the current lockfile. If it ends up not being possible, error out.

Would also be helpful to get a diff/migration report when the goal is completed successfully for a pre-existing lock.

danxmoran commented 2 years ago

I think my ideal UX for this would be for generate-lockfiles to only update things that have changed in python_requirement targets by default, and to only do a full regeneration if you pass an option. I've almost never wanted generate-lockfiles to bump the versions of floating transitive dependencies - instead I typically want it to either add a brand new dependency or update the version of a top-level dep.

Eric-Arellano commented 2 years ago

I think my ideal UX for this would be for generate-lockfiles to only update things that have changed in python_requirement targets by default, and to only do a full regeneration if you pass an option.

Thanks for the suggestion! I think this would be feasible to pull off because we have the lockfile header with the prior requirement versions written down. If that lockfile header is missing, then we do the next best thing of generating the whole lock.

cognifloyd commented 1 year ago

This seems related to #12880

cognifloyd commented 1 year ago

Also related: #15275

jonasrauber commented 3 months ago

This would be nice! For comparison, poetry has a --no-update flag on it's lock command that does exactly this: https://python-poetry.org/docs/cli/#lock

jsirois commented 3 months ago

As has Pex, since 2022. Pants has been exceedingly slow to adopt the Pex support. There was a recent engagement that produced fixes and more features from Pex for all this, but still no discernible motion from Pants.