conda / ceps

Conda Enhancement Proposals
Creative Commons Zero v1.0 Universal

CEP 4 - Deprecate `pip` interoperability and Default `pip` to have `--no-deps` in all environments #4

Closed beckermr closed 2 years ago

beckermr commented 2 years ago

Here it is!

@jezdez @ocefpaf

beckermr commented 2 years ago

We should have pip refuse to touch packages managed by conda as well. Idk if there is a mechanism to make that work. This may be related to the recent pep?

msarahan commented 2 years ago

Sorry I missed this discussion today. It might be worth involving the PyPA people in this discussion. While using --no-deps certainly makes it harder for people to shoot themselves in the foot, it also makes it harder for them to get something working. I can imagine that this PR may lead to lots of import errors, which may or may not be straightforward to people, especially when the import name does not match the package name.

Edit: in other words, please take a holistic look at how many headaches are created either way. As package and package manager maintainers, this community is strongly biased towards curing the problem of pip-trampled environments. In practice, what fraction of pip installs lead to issues? I fear that this may unintentionally turn people away from the conda-based ecosystem because the import errors will be more widespread than the pip trampling issues currently are.

beckermr commented 2 years ago

Yep. The goal is to allow an override of this via item 3 in the spec.

beckermr commented 2 years ago

Which would address your very valid user-experience concern.

msarahan commented 2 years ago

Requiring people to change a setting is still expecting too much from many users. @ocefpaf may have better experience here from supporting many more users, but in my experience, if people see any kind of traceback, especially one that is indirect (here an import error happens because conda will have changed pip), they throw up their hands and try something else. The availability of an option definitely is not sufficient to avoid massive impact here.

Edit: in order to make the import error more readily associate-able with the pip behavior, you'd probably have to patch python itself to alter the text of the import error such that it would describe this --no-deps behavior. That sounds like a pretty horrific change.

beckermr commented 2 years ago

Unfortunately, the current behavior has the same "throw your hands up in the air" problem. People get their environments in a weird spot due to not having used pip with --no-deps and go and try something else. In other words, the impact of the current defaults is already hugely massive.

The actual question is what default do we want? Both options have a massive impact no matter what.

ocefpaf commented 2 years ago

My original idea was to not add that option by default. Just issue a warning that when pip installing things in a conda environment we recommend --no-deps. Ideally we could have a [Y/n] prompt and the user can proceed, but warned, or cancel and try --no-deps knowing that there is some extra work. Not sure about the [Y/n] prompt but I prefer a warning than any setting that will change the expected behavior, even when that behavior can be harmful.

beckermr commented 2 years ago

Even a [Y/n] would break the current system in say a CI environment. Maybe the minimal change here is to add an option to turn off pip with dep resolution? This might also be easily achievable through environment variables that control pip itself. It'd be nice for conda to inject them I suppose.

msarahan commented 2 years ago

What if you patched pip such that it behaved with --no-deps, but verbosely described the dependencies that are not currently available? Icing on the cake would be to output a conda command that takes care of installing them, or even a prompt to run such a command. If packages aren't available on conda, then output a pip install command to take care of remaining deps (all with --no-deps). This has to be recursive, which may make it infeasible.

This pip command has been modified to avoid interfering with your conda environment. It does not use pip to install dependencies (using the --no-deps flag). The following dependencies are not satisfied in your environment:

Available from conda:
- x
- y
- z

Available only from pip:
- a
- b
- c

The key thing is that users need to expect that pip installs won't work "out of the box" most of the time. The easier you can make it for them to get them working, the more successful you'll be.
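A minimal sketch of the report proposed above, not an existing pip or conda feature. The function names and the three inputs (required names, installed names, names available from conda) are hypothetical; a real tool would have to obtain them from package metadata and a conda channel index, and would have to repeat the process for each newly installed dependency. Names are normalized in the PEP 503 style so that `NumPy` and `numpy` compare equal.

```python
import re


def normalize(name):
    """Normalize a project name (PEP 503 style) so spelling variants compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def split_unsatisfied(required, installed, conda_available):
    """Split the missing requirements into (available from conda, pip only).

    All three arguments are iterables of project names supplied by the
    caller; this sketch does not query pip, conda, or any index.
    """
    installed = {normalize(n) for n in installed}
    conda_available = {normalize(n) for n in conda_available}
    missing = [normalize(n) for n in required if normalize(n) not in installed]
    from_conda = [n for n in missing if n in conda_available]
    pip_only = [n for n in missing if n not in conda_available]
    return from_conda, pip_only


def format_report(from_conda, pip_only):
    """Render the two lists in the layout of the message sketched above."""
    lines = ["The following dependencies are not satisfied in your environment:", ""]
    lines.append("Available from conda:")
    lines.extend(f"- {name}" for name in from_conda)
    lines.append("")
    lines.append("Available only from pip:")
    lines.extend(f"- {name}" for name in pip_only)
    return "\n".join(lines)
```

The sketch only covers one level of dependencies, which is exactly where the recursion concern comes in.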

ocefpaf commented 2 years ago

This has to be recursive, which may make it infeasible.

I love your energy on your suggestions :smile:

Even a [Y/n] would break the current system in say a CI environment. Maybe the minimal change here is to add an option to turn off pip with dep resolution?

Yeah, I'm on the fence with a [Y/n]

This might also be easily achievable through environment variables that control pip itself. It'd be nice for conda to inject them I suppose.

I believe that every pip option can also be an env var, so that would be relatively easy to implement.
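As a rough illustration of the environment-variable route: pip documents that long options can be set through variables named `PIP_<UPPER_LONG_NAME>`, with dashes replaced by underscores, so `--no-deps` corresponds to `PIP_NO_DEPS`. The helper names below are hypothetical, and whether conda would inject such variables on activation is only the idea floated in this thread, not existing behavior.

```python
import os


def pip_env_var(option):
    """Map a pip long option such as '--no-deps' to its environment variable name."""
    return "PIP_" + option.lstrip("-").replace("-", "_").upper()


def env_with_pip_defaults(options, base_env=None):
    """Return a copy of the environment with pip defaults filled in.

    `options` maps long option names to string values. Variables that are
    already set are left alone, so a user can still override the default.
    """
    env = dict(os.environ if base_env is None else base_env)
    for option, value in options.items():
        env.setdefault(pip_env_var(option), value)
    return env
```

The resulting mapping could be passed as `env=` to `subprocess.run` when launching pip. Using `setdefault` keeps the override path open, which matters given the concerns above about changing defaults silently.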

jezdez commented 2 years ago

We should have pip refuse to touch packages managed by conda as well. Idk if there is a mechanism to make that work. This may be related to the recent pep?

It is. PEP 668 describes an EXTERNALLY-MANAGED ini-style file that can be dropped into a Python context, e.g. a Conda environment or a virtualenv (e.g. /home/jezdez/miniconda3/envs/spameggs/lib/python3.9), to prevent pip (or any other PEP 668 compatible Python package manager) from installing in that context.

jezdez commented 2 years ago

Oh and it can contain a custom message that is printed when that tool tries to install something in that context.
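For reference, a small sketch of what such a marker file could look like and how a tool might read its message. It assumes the layout described in PEP 668: an ini-style file named `EXTERNALLY-MANAGED` in the interpreter's stdlib directory, with an `[externally-managed]` section and an `Error` key. The message text and the helper names are made up for illustration.

```python
import configparser
import sysconfig
from pathlib import Path

# Hypothetical marker contents; continuation lines are indented, ini-style.
MARKER_TEXT = """\
[externally-managed]
Error=This environment is managed by conda.
 Install packages with conda where possible.
"""


def marker_path():
    """Location where a PEP 668 aware installer would look for the marker."""
    return Path(sysconfig.get_path("stdlib")) / "EXTERNALLY-MANAGED"


def read_marker_error(text):
    """Return the custom error message from marker text, or None if absent."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser.get("externally-managed", "Error", fallback=None)
```

Note that the marker applies to the whole environment; it carries no per-package information.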

beckermr commented 2 years ago

Thanks @jezdez! Refusing to install anything is too disruptive too. 🤔

jezdez commented 2 years ago

Thanks @jezdez! Refusing to install anything is too disruptive too. 🤔

To clarify, I'm not suggesting that this is what we want, just noting that's how the PEP is currently written. There is a chance we might have an impact on this PEP though since it's still a draft and @geofft and I have talked before about how it would impact Conda.

I would also suggest bringing him into this conversation, along with @pradyunsg and @FFY00, who know more about the various areas of work in pip and the stdlib. E.g. @FFY00 has brought bpo-43976 to my attention, which also caters to the idea of distributors being able to influence how pip (et al) install packages.

beckermr commented 2 years ago

Thanks, @jezdez! You are correct.

I guess the thing that trips me up is that the current user story here seems like "here is a loaded gun. good luck!"

There could be nothing to do about that though.

msarahan commented 2 years ago

I think the current user story gives the user something akin to an exoskeleton - a tool that augments their ability. Attached to that exoskeleton is a welding/cutting torch, but no one ever warns them that using that torch may break other parts of their tool.

The issue is not that anyone says "good luck" - it's that the tool is there to be used (and widely documented in web examples), but the danger of using it is never stated, and most often discovered after disaster has occurred.

I'd argue that anyone who knows that pip can corrupt conda environments is not the target audience here. They are people who create their own purpose-built environments, who won't be totally hosed when pip clobbers something. The target audience here is the inexperienced user who will accidentally corrupt their root environment that they've had for 6 months, and end up stuck on how to recover.

pradyunsg commented 2 years ago

I see a mention here, but I’ve not read the discussion so far — just the patch. I do see links to PEP 668 though.

Given that there’s no motivation section, it’s difficult for me to judge whether this is a good idea. I’m gonna say a bunch of things, based on what I can infer. This might all be moot if the motivation for this CEP is correctly placed. :)

Instinctively, this seems like a bad idea. This is swapping out one annoyance for a different one, with not enough benefits for the churn it’ll cause. No user expects pip install requests followed by import requests to fail with an ImportError, so you’d be adding a new source of confusion and frustration. It’ll maybe be slightly less nuanced to explain, but it’ll still be fundamentally the same experience and still bad. You can still break your environment. You can still end up in a situation where conda and pip clobber each other.

IMO, what will be an improvement is to make pip know what not to touch, effectively solving the interoperability story from the other side as well. That’s what PEP 668 does.

beckermr commented 2 years ago

Agreed it'd be great to have them know how to not clobber one another. My understanding of the current PEP is that it is not fine grained enough to cover the use cases here. I may be totally wrong on that though.

@jezdez I think we can close this one out as rejected or deferred or w/e you think is appropriate.

beckermr commented 2 years ago

Also thank you to everyone for the good discussion on these issues!

pradyunsg commented 2 years ago

My understanding of the current PEP is that it is not fine grained enough to cover the use cases here.

I won't claim to know the use cases too well myself and you probably understand your own use case better than I do. ;)

If folks genuinely have this feeling though, now is the best time to flag it in the discussion about this PEP. It's very much actively under discussion. If we aren't accommodating conda properly, I'd really appreciate it if you could elaborate on that here: https://discuss.python.org/t/10302/17 (needs registration, can login with GitHub).

I'm gonna guess that the concern is around mixing packages from conda and pip, or something along those lines? I'm sure there's an answer we can figure out for that, and putting details on that thread would be a good start toward that.

beckermr commented 2 years ago

I am very happy to comment on the PEP to help move it along! Thanks for suggesting.

I'm gonna guess that the concern is around mixing packages from conda and pip, or something along those lines? I'm sure there's an answer we can figure out for that, and putting details on that thread would be a good start toward that.

Yep. There are some cases where you want to install a package with pip into a conda environment. These include

  1. Development (i.e., pip install -e .) though my guess is directly invoking the setup.py is a work-around here.
  2. Certain ways in which packages link (e.g., mpi4py needs to link to the non-conda MPI binaries on some systems and a pip install of it is very useful for that - though note conda-forge has better ways of doing this)
  3. Dependencies not in conda (e.g., sometimes for practical reasons one needs a pip installed dependency that is not in conda-forge or conda or w/e)

The key is to do it cleanly and of course, being able to undo all of the above things cleanly. Just to be more specific ideally (note some of these work great already!)

  1. pip would never try to uninstall something already installed by conda
  2. would be able to uninstall anything it did install and know what those things are (related to 1 here ofc)
  3. would help the user defer to conda for dependencies

On this last point, the namespace management for packages is really hard here, but a warning about what pip would have installed without actually doing it might be useful per comments from @msarahan.

I am happy to move these comments to the PEP above or at least point folks back to this thread if you think it is useful @pradyunsg!
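One per-package signal that already exists is the `INSTALLER` file that installers may write into a distribution's `.dist-info` directory. The sketch below groups installed distributions by that file using `importlib.metadata`. It assumes conda-installed packages record `conda` there, which is not guaranteed for every package, so treat the result as a hint rather than an authoritative answer. The function names are hypothetical.

```python
from importlib import metadata


def installer_of(dist):
    """Return the tool named in the distribution's INSTALLER file, or None."""
    text = dist.read_text("INSTALLER")
    return text.strip() if text else None


def group_by_installer(dists=None):
    """Group distribution names by recorded installer.

    Defaults to every distribution visible to the running interpreter;
    distributions without an INSTALLER file are grouped under 'unknown'.
    """
    groups = {}
    for dist in metadata.distributions() if dists is None else dists:
        name = dist.metadata.get("Name") or "<unnamed>"
        groups.setdefault(installer_of(dist) or "unknown", []).append(name)
    return groups
```

A tool could consult such a grouping before uninstalling anything, refusing to remove entries it did not install itself.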

FFY00 commented 2 years ago

Development (i.e., pip install -e .) though my guess is directly invoking the setup.py is a work-around here.

This has been standardized, so we could implement it in conda.

pradyunsg commented 2 years ago

Development

Could you clarify what you mean by this?

beckermr commented 2 years ago

Sure!

When developing a package, you generally want the deps installed by conda but not the actual package. I commonly use a workflow like this. Suppose I am developing package foo. Then I would do in my local conda env

conda install foo  # gets the latest version w/ deps
conda uninstall --force foo  # removes only foo from the env, leaving the deps
cd /path/to/foo/locally
pip install -e .

Then when I am done

pip uninstall foo  # remove the dev install
conda install foo  # put back the conda install

There may be a better workflow here, but that is what I have in mind.

pradyunsg commented 2 years ago

Ah, I understand your workflow now! I don't understand the issue with the workflow though. Does this not work at some specific point right now?

  • [pip] would be able to uninstall anything it did install and know what those things are
  • pip would never try to uninstall something already installed by conda

This is precisely what PEP 668 is designed to do. pip would be able to see things installed by conda, and won't touch them.

would help the user defer to conda for dependencies

Well, to set expectations, this isn't going to happen directly in pip anytime soon. This will need some sort of abstract standard for deferring to external package managers, and that's... broad.

That said, for the user, what you just described of "get things I need from conda, everything else from pip" should work fine without clobbering after PEP 668 is "out in the wild".

beckermr commented 2 years ago

Does this not work at some specific point right now?

This works fine ofc!

My motivation in this CEP is for cases 2-3 above.

a) Many folks load up their conda environment and then without thinking do things like

pip install my-local-package

When this package has dependencies listed in the setuptools metadata, pip goes out and installs them. That can break the conda env or make things super weird when conda comes back later with the same package.

b) There are valid cases where one wants pip to install to a conda-managed python interpreter. We can't mark the whole thing as externally managed.

This is precisely what PEP 668 is designed to do. pip would be able to see things installed by conda, and won't touch them.

What I don't follow in the spec is that there is this single file in the tree. How does that let pip know about the status of each package in a dir with this file? Sometimes people correctly use pip to install a package in the conda env as a valid use-case (e.g., when it is not in conda).

beckermr commented 2 years ago

I think my understanding of the spec is correct given the text in the Use Cases section but I could be wrong OFC.