conda / conda-lock

Lightweight lockfile for conda environments
https://conda.github.io/conda-lock/

`--update` ignored for packages not in existing lockfile #386

Open timsnyder-siv opened 1 year ago

timsnyder-siv commented 1 year ago

Observed Behavior

Adding a package to an input specs file and then asking conda-lock to --update that package causes the environment to be completely re-solved, and the --update request is silently dropped on the floor.

Expected Behavior

I would expect one of two reasonable outcomes:

  1. conda-lock errors and says that it cannot update the lockfile for something not already in the lockfile
  2. conda-lock does the best it can to solve the environment in a way that is consistent with the requirements and fails if the solver can't find a solution.

Acknowledgement of Difficulty

If you try to ask conda (22.11.1) to update a package that isn't already installed, you get a message like:

-bash-4.2$  conda update fsspec

PackageNotInstalledError: Package is not installed in prefix.
  prefix: /opt/conda
  package name: fsspec

And then a typical user will likely just conda install fsspec and not realize that it might be impossible to give the solver a set of requirements that would put them back into this exact set of packages, because any version requirements they had when previously creating or installing are not maintained across conda invocations (unless they are actually 'pinned').

So, I'm definitely not asking for you to make it possible for conda-lock to give me an inconsistent environment. There will be plenty of cases that just can't be solved by incrementally adding a new package. However, conda-lock should fail loudly when that happens.

Concrete Example

In my case, I want to add fsspec to https://github.com/firesim/firesim/blob/a81e3725c11406ea740c3f1a893d344e049f458f/conda-reqs.conda-lock.yml. To do that, I:

  1. add fsspec to https://github.com/firesim/firesim/blob/a81e3725c11406ea740c3f1a893d344e049f458f/conda-reqs.yaml
  2. conda-lock -f conda-reqs.yaml -p linux-64 --lockfile conda-reqs.conda-lock.yml --update fsspec

NOTE: I don't necessarily install the package into the conda environment because conda-lock doesn't use the contents of the currently active environment in its calculation of the lockfile, does it?

However, I find that unless the package you're trying to add is already in the lockfile, the intersection() at https://github.com/conda/conda-lock/blob/854fca9923faae95dc2ddd1633d26fd6b8c2a82d/conda_lock/conda_solver.py#L151 filters the added package out of to_update, and conda-lock effectively re-solves the entire environment instead of using the update code.
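To illustrate the filtering described above, here is a minimal sketch. It assumes the locked packages are held in a mapping keyed by package name; the values and package versions are hypothetical stand-ins, not conda-lock's actual data.

```python
# Hypothetical stand-in for the packages already recorded in the lockfile.
conda_locked = {"python": "3.10.9", "numpy": "1.24.2"}

# Names passed on the command line via --update.
update = ["fsspec"]

# intersection() keeps only names present in both collections, so a
# package that is missing from the lockfile disappears without any message.
to_update = set(update).intersection(conda_locked)

print(to_update)  # set()
```

With an empty to_update there is nothing left for an update path to act on, which matches the full re-solve I observed.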

Also, I think it is mildly confusing to suggest step 1 is installing the package into the currently active environment because it implies that having the packages installed somehow influences how conda-lock creates the lockfile, which I don't think is correct. If you are using a lockfile to reproducibly create an environment, it would probably be better to create the updated lockfile and then use it to create the environment. I haven't tried installing on top of an existing environment to see what happens but I'd expect it to be the same as removing the environment completely and reinstalling everything in the lockfile (but perhaps with optimizations for leaving unchanged packages alone).

_Originally posted by @timsnyder-siv in https://github.com/conda/conda-lock/pull/355#discussion_r1111114862_

srilman commented 1 year ago

I believe the --update flag is only officially supported for one very specific use case. Assume the following:

Only in this situation is it recommended to use the --update flag. There is a feature request open to support other forms of updates: https://github.com/conda/conda-lock/issues/370, but we're currently in the discussion phase. If you have any ideas, please let us know!

Apologies for the confusion! I noticed that the documentation around the --update flag is a bit sparse and a little misleading; we can definitely make it clearer. In addition, I think we can add some checks during runtime to ensure that the source files have not been changed. Maybe we could use the source hash in the lockfile? I know there is a --check-input-hash argument that does something similar, but that's opt-in.

timsnyder-siv commented 1 year ago

Thanks for the reply @srilman ! You could add a simple check at the code I pointed out in the original post: https://github.com/conda/conda-lock/blob/854fca9923faae95dc2ddd1633d26fd6b8c2a82d/conda_lock/conda_solver.py#L151

If my use case isn't supported, I would suggest that you do something like:

    from warnings import warn

    for name in set(update).difference(conda_locked):
        warn(f"--update only works for previously locked packages. Ignoring {name}")

I'd almost rather that you raise RuntimeError() with the message instead of ignoring the input.
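A rough sketch of that stricter variant is below. The function name check_update_targets and its parameters are hypothetical, chosen only for illustration, and this is not code from conda-lock itself.

```python
from typing import Iterable, Set


def check_update_targets(update: Iterable[str], locked_names: Iterable[str]) -> Set[str]:
    """Return the requested package names, or raise if any is not already locked."""
    requested = set(update)
    # difference() accepts any iterable, so locked_names may be a set,
    # a list, or a dict keyed by package name.
    missing = requested.difference(locked_names)
    if missing:
        names = ", ".join(sorted(missing))
        raise RuntimeError(
            f"--update only works for previously locked packages: {names}"
        )
    return requested
```

Called with ["fsspec"] against a lockfile that lacks fsspec, this raises instead of quietly falling back to a full re-solve.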

Thanks!