conda / conda-lock

Lightweight lockfile for conda environments
https://conda.github.io/conda-lock/

Support Changing Dependencies in an Existing Lockfile #370

Open srilman opened 1 year ago

srilman commented 1 year ago

What is the idea?

It would be nice if there was some way to make updates to dependencies in an existing lockfile with minimal changes, such as:

All of these options differ from conda-lock --update in that they require some associated change in the source files used to generate the lockfile, whereas conda-lock --update mainly expects that the sources have not been changed in any way.

Why is this needed?

These are all useful features in a situation where an end user would like to modify a single package without any other unrelated changes. For example, someone might want to test upgrading a dependency to the next major version without any other side effects.

Note that we can currently achieve all of these by rebuilding the lockfiles from scratch. However, that could lead to other unwanted dependencies changing.

What should happen?

My initial thought is that we could have an additional sub-command like conda-lock update which only takes in a lockfile and a change to make. It would update the lockfile with that change. It should also somehow indicate that the lockfile may no longer match the original sources.

Additional Context

A quick test with mamba makes it seem like these should all be possible, but I'm not confident (and not sure about conda or micromamba).

maresb commented 1 year ago

I wonder if lockfiles are the appropriate tool for achieving what you want. Maybe you just want to export your environment after updating a package?

In my mind, a lockfile is by definition a function of:

In mathematical notation, the function is $$\textrm{lockfile}(\textrm{sources}, \textrm{packages}(t)).$$ Thus there are exactly two ways a lockfile can change: either the source files change, or time has passed and the available packages have consequently changed.

Therefore, if a lockfile is modified by some external process, from this perspective it becomes a "lie".
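To make the "function" view concrete, here is a minimal Python sketch. It is only a toy: the `Sources` and `Index` formats and the "newest version that meets a minimum" rule are assumptions made for illustration, and none of this reflects how conda-lock actually solves environments.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, ...]
Sources = Dict[str, Version]      # hypothetical spec: package -> minimum version
Index = Dict[str, List[Version]]  # hypothetical snapshot: packages available at time t


def lockfile(sources: Sources, packages: Index) -> Dict[str, Version]:
    """Toy deterministic lock: newest available version meeting each minimum."""
    locked: Dict[str, Version] = {}
    for name in sorted(sources):
        candidates = [v for v in packages.get(name, []) if v >= sources[name]]
        if not candidates:
            raise ValueError(f"no version of {name} satisfies the minimum")
        locked[name] = max(candidates)
    return locked


sources = {"numpy": (1, 20), "pandas": (1, 0)}
packages_t0 = {"numpy": [(1, 20), (1, 21)], "pandas": [(1, 0), (1, 5)]}
# Later snapshot: a newer numpy has been published, sources are unchanged.
packages_t1 = {"numpy": [(1, 20), (1, 21), (1, 22)], "pandas": [(1, 0), (1, 5)]}

lock_t0 = lockfile(sources, packages_t0)
lock_t1 = lockfile(sources, packages_t1)
```

With identical inputs the result is identical, and the only difference between `lock_t0` and `lock_t1` comes from the index changing over time.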

Do you think environment exports cover your use case, or is there still some added functionality you'd like to see?

srilman commented 1 year ago

Environment exporting may work, although I don't really see how it would be different from using a lockfile, except maybe that we couldn't update arbitrary platforms. I feel like these are features that would be useful in most package managers with lockfiles.

That said, I do agree that we should avoid modifying the lockfile so that it no longer applies to the source files. I'm just not 100% sure how to achieve this. I was inspired by the poetry lock --no-update command. When I was working on a poetry project recently, I was able to add and remove dependencies in my pyproject.toml and use that command to minimize the changes to my poetry.lock.

I'm not really sure how Poetry is able to do this though. I wonder if they use Git history in some way, or use the current lock contents as a baseline. I know that some lockfile specifications sometimes include a copy of the source file contents in the lockfile; that may achieve the same thing. Another option is to add a new flag to dependencies in the lockfile indicating that it is a source dep or a sub-dep?
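One of the options above, using the current lock contents as a baseline, could look roughly like the following Python sketch. This is an assumption about one possible approach, not a description of what Poetry or conda-lock does; `relock_minimal` and the data formats are hypothetical, and transitive dependencies are ignored.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, ...]


def relock_minimal(
    previous: Dict[str, Version],    # contents of the existing lockfile
    sources: Dict[str, Version],     # edited sources: package -> minimum version
    index: Dict[str, List[Version]], # versions currently available
) -> Dict[str, Version]:
    """Keep every existing pin that is still valid; solve only what changed."""
    locked: Dict[str, Version] = {}
    for name, minimum in sorted(sources.items()):
        pinned = previous.get(name)
        still_valid = (
            pinned is not None
            and pinned >= minimum
            and pinned in index.get(name, [])
        )
        if still_valid:
            locked[name] = pinned  # baseline wins, even if something newer exists
            continue
        candidates = [v for v in index.get(name, []) if v >= minimum]
        if not candidates:
            raise ValueError(f"no version of {name} satisfies the minimum")
        locked[name] = max(candidates)
    return locked  # packages removed from sources simply drop out


previous = {"numpy": (1, 21), "pandas": (1, 5)}
index = {
    "numpy": [(1, 21), (1, 22)],
    "pandas": [(1, 5), (2, 0)],
    "scipy": [(1, 8), (1, 9)],
}

# Add scipy to the sources without touching the other requirements.
added = relock_minimal(previous, {"numpy": (1, 20), "pandas": (1, 0), "scipy": (1, 0)}, index)
# Remove pandas from the sources.
removed = relock_minimal(previous, {"numpy": (1, 20)}, index)
```

In this toy, adding `scipy` leaves the `numpy` and `pandas` pins untouched even though newer versions exist, and the result still satisfies the edited sources.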

maresb commented 1 year ago

To correct what I wrote earlier, there are usually many possible solutions to the specifications, and as long as the lockfile contains one solution it should be fine. (I am also interested in reproducibility of lockfiles, especially when given a timestamp, but I need to remind myself that this is not a requirement.)

If I'm understanding correctly, you are interested in a mechanism which would allow for simultaneously changing the lockfile and source files? That sounds interesting but potentially very subtle. It's easy to imagine situations where there are shared subdependencies so that upgrading one thing requires either one or the other package to be updated, and each possibility leads to a chain of updates. I don't understand mathematically how this is handled, but I am very curious.
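The shared-subdependency situation can be illustrated with a tiny brute-force search in Python. Everything here is hypothetical (the `INDEX` contents, the package names, and the "fewest changed pins" criterion), and real solvers do not enumerate combinations like this; the sketch only shows how pinning one package can force a chain of other updates.

```python
from itertools import product

# Hypothetical index: package -> version -> {dependency: allowed versions}
INDEX = {
    "a": {1: {"shared": {1}}, 2: {"shared": {2}}},
    "b": {1: {"shared": {1}}, 2: {"shared": {1, 2}}},
    "shared": {1: {}, 2: {}},
}


def is_consistent(assignment):
    """True if every chosen version's dependency constraints are met."""
    for name, version in assignment.items():
        for dep, allowed in INDEX[name][version].items():
            if assignment[dep] not in allowed:
                return False
    return True


def solutions(pins):
    """Enumerate all consistent assignments that respect the given pins."""
    names = sorted(INDEX)
    found = []
    for combo in product(*(sorted(INDEX[n]) for n in names)):
        assignment = dict(zip(names, combo))
        if all(assignment[n] == v for n, v in pins.items()) and is_consistent(assignment):
            found.append(assignment)
    return found


def closest(previous, candidates):
    """Pick the candidate that changes the fewest pins relative to previous."""
    return min(candidates, key=lambda s: sum(s[n] != previous[n] for n in s))


previous = {"a": 1, "b": 1, "shared": 1}
unconstrained = solutions({})   # several valid lockfiles exist
candidates = solutions({"a": 2})  # request an upgrade of "a" only
new_lock = closest(previous, candidates)
```

Here three assignments satisfy the constraints when nothing is pinned, but asking for `a` at version 2 leaves a single solution in which `shared` and `b` must both move as well.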

I also don't understand how conda-lock --update works. I should probably read through the source, because until then I probably can't provide much insight. :smile:

jayqi commented 9 months ago

+1 for this being something that is useful and that I expected from a locking tool.

My use case is that I want to maintain a lockfile that specifies an environment in which an audience of users will develop and run code. I need the flexibility to change the environment (mostly to add dependencies) while keeping the changes as minimal as possible to avoid the risk of disruption. (I.e., I don't want adding a dependency to unnecessarily upgrade the version of an unrelated package.)

For context, as far as I understand it, this is the default behavior of pip-compile. You need to use the --upgrade flag to opt in to upgrading dependencies.

(As a relatively uninformed opinion, the "function" mental model is not what I have for locking. My mental model is that dependency resolution is just finding some solution to a set of constraints, and writing the full specification of that solution down.)

nicoddemus commented 7 months ago

+1 here as well.

We have recently migrated to lock files, and we often want to add a new dependency by adding it to the source file, but we do not want to update everything else at the same time.

To illustrate, suppose we have source.yml (the constraints) and locked.yml (the file generated by conda-lock), this is the workflow I would expect/like:

This feature is important because updating everything in the lockfile often needs to be carefully tested, which is best done in a separate PR/change from the one adding a new package.

As @jayqi mentions, this is the default behavior of pip-compile and it works well.

Finally, many thanks to the contributors for conda-lock, we appreciate how hard it is to maintain a tool like this.