pypa / pip

The Python package installer
https://pip.pypa.io/
MIT License
Make install command upgrade packages by default #3786

Closed pradyunsg closed 7 years ago

pradyunsg commented 8 years ago

Based on discussion over at #59, there's interest in making the install command upgrade an installed package by default. This behaviour would make pip consistent with various other package managers, with regard to the behaviour of its install command.

This issue is meant to be the location for that discussion, since this deserves its own issue.

njsmith commented 8 years ago

Okay, here's a proposal:

End goal (where we want to end up)

Transition option A

Transition option B

KISS: skip phase 1 and go directly from phase 0 to phase 2. Rationale: it's not clear that this will actually break anything, people are going to be somewhat confused and annoyed in either case, it's entirely possible they'll be more confused and annoyed by the phased transition than by the actual change, we have limited resources, and we're eager to get to the shiny new future.

In this version we can also probably skip adding --upgrade-non-recursive, since it's immediately redundant as soon as it's introduced.

Comment

I'm sorta expecting that everyone will push back and insist on transition option A instead of transition option B. But I'd actually be happy with either one, so instead of pre-emptively compromising I'm going to let someone else make that argument (if they want to) :-).

xavfernandez commented 8 years ago

Hmm, I don't see the added-value of your pip require ? It looks like a duplicate of pip install ? Or maybe a pip install --no-upgrade ?

pfmoore commented 8 years ago

I'm happy enough with option B.

But I don't follow your description. You say pip require foo: same as the current pip install foo. So it'll error if foo is installed? And pip install --upgrade-recursive foo: same as the current pip install --upgrade foo. I thought there were problems with the existing install --upgrade behaviour (beyond it not being the default) - there's a whole load of discussion somewhere about needing a SAT solver. Is your proposal that we don't do anything about those issues? Or am I misremembering and there's not actually a problem with the current --upgrade behaviour?

dstufft commented 8 years ago

I'm happy with option B.

I don't like the idea of a pip require command for the same reasons I didn't like the split pip install and pip upgrade commands. Two commands that do sort of the same thing but not quite forces people to make a decision about which one they use up front, versus using flags. I also think that it's good practice for boolean flags (ones that toggle something on/off) to have an inverse wherever it makes sense, to allow people to compose commands better.
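The point about boolean flags having an inverse can be sketched with Python's argparse. This is an illustration only: the parser below is hypothetical, it is not pip's real option parser, and the default of "on" is an assumption taken from the proposal under discussion. `argparse.BooleanOptionalAction` requires Python 3.9 or later.

```python
import argparse

# Hypothetical parser for illustration only; this is not pip's real option parser.
parser = argparse.ArgumentParser(prog="install-sketch")
# BooleanOptionalAction (Python 3.9+) registers both --upgrade and --no-upgrade.
parser.add_argument(
    "--upgrade",
    action=argparse.BooleanOptionalAction,
    default=True,  # assumed default, following the proposal in this thread
    help="upgrade already-installed packages",
)
parser.add_argument("packages", nargs="+")


def parse(argv):
    """Parse a list of command-line arguments and return the namespace."""
    return parser.parse_args(argv)


if __name__ == "__main__":
    # The last flag given wins, so a wrapper script can set one value
    # and the caller can still override it.
    print(parse(["--upgrade", "--no-upgrade", "foo"]).upgrade)
```

Because both spellings exist, a command assembled from several sources (an alias, a script, a user's extra arguments) can always be steered back to either behaviour without removing earlier flags.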

So with all that in mind, here's what I would do:

You might notice that there's nothing like the current behavior listed so far: an "upgrade everything in the dependency path to the latest version" sort of flag. I'm on the fence about whether we really want something like that (and if we want it, do we want to keep it forever, or would it just be a temporary shim to ease transition). Another thing to keep in mind when deciding this is how the theoretical pip upgrade command affects this decision. In other words, if we have a command to upgrade all the installed items, do we foresee people ever wanting to upgrade X and all of its dependencies?

If we do want something like the current --upgrade behavior, then I think I see two options:

In terms of the dependency resolver, I don't think these two issues are really intertwined that much. Our resolving is currently a problem in both the pip install and the pip install --upgrade case, and I believe it will continue to be a problem with the proposed changes. It's something that needs fixed, but I don't think it has any bearing on what we do here (although it likely does have some bearing on the hypothetical pip upgrade command).

pfmoore commented 8 years ago

I'm not aware of a strong requirement for the current behaviour (by "strong" I mean "anything other than backward compatibility"). But if people did need it, they can get it by simply listing all of the dependencies on the command line.

It's pretty easy to write a script to show all (recursive) dependencies of a package:

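# reqs.py
# Print a package and all of its (recursive) dependencies, as currently installed.
import sys
from pkg_resources import get_distribution

def reqs(req):
    results = []
    seen = set()
    queue = [get_distribution(req)]
    while queue:
        current = queue.pop()
        if current.key in seen:
            # Already visited: skip duplicates and avoid looping on cycles.
            continue
        seen.add(current.key)
        results.append(current)
        for r in current.requires():
            # get_distribution accepts a Requirement as well as a name.
            queue.append(get_distribution(r))
    return results

if __name__ == '__main__':
    print('\n'.join(sorted(set(d.project_name for d in reqs(sys.argv[1])))))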

Then you just do pip install $(python reqs.py foo) to get an "eager install" of foo and its dependencies. I'm sure there are shortcomings with this approach, but is the problem common enough to warrant a more complex solution?

dstufft commented 8 years ago

@pfmoore well that script only works if no dependencies have changed between the currently installed versions and the to-be-upgraded-to versions (and of course, it assumes everything is already installed).

That being said, the only real use case I can come up with is installing a project into an environment that already has stuff installed into it, but wanting to have the latest version of dependencies. IOW, a framework like Pyramid might prefer that new users install its dependencies using the recursive upgrade. HOWEVER, even in this scenario, (which is the only one I can think of) if the hypothetical Pyramid's version specifiers are all correct, then the end user should expect it to work regardless (and it's similar in nature to what folks would get already in the current pip install behavior with something already installed).

If someone does want "Pyramid, and all of its dependencies up to date", it's somewhat nicer than the proposed way of doing that (combining the two proposals), which would be pip install Pyramid && pip upgrade (which isn't exactly the same, since pip upgrade would do more than just Pyramid).

So my hesitation is that I struggle to come up with a scenario where it's the clear-cut right thing to do, though it could make some edge cases moderately nicer. We could always leave it out and, if we come across people asking for it, add it back in at that point.

pradyunsg commented 8 years ago

I dislike both A and B. I don't like the idea of introducing a new command, nor do I want to switch to the new behavior without some "deprecation" style period for the current behavior. Hence I put forth my own proposal below.

I'm not aware of a strong requirement for the current behaviour

Me neither. Yet, I don't want to break someone's working code without telling them. I would find it rude. 'Don't do unto others what you don't want others to do unto you.' This is why I think I don't want to switch with no warning as in @njsmith's B option either.

If someone does want "Pyramid, and all of its dependencies up to date", it's somewhat nicer than the proposed way of doing that (combining the two proposals), which would be pip install Pyramid && pip upgrade (which isn't exactly the same, since pip upgrade would do more than just Pyramid).

As I understand it, if someone wants "Pyramid, and all of its dependencies up to date", after the switch to the new behavior, it's pip install --upgrade-strategy=eager Pyramid. That would eagerly upgrade Pyramid and its dependencies to the latest version, regardless of whether an upgrade is necessary.

I thought it was clear that we wanted to provide both the current "recursive-latest" and the new default "only-if-needed" upgrades. Just emphasizes that I need to post the common accepted ideas.
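To make the difference between the two upgrade styles concrete, here is a minimal sketch over a toy model. Everything in it is an assumption made for illustration: the function name, the data layout, and the version numbers are invented, versions are plain tuples, and there is no real dependency resolution (each package is visited once and only a minimum version is checked). It is not how pip implements either strategy.

```python
def plan_upgrade(target, installed, latest, requires, strategy):
    """Return {name: version} for the packages that would be (re)installed.

    installed: {name: version tuple} currently in the environment
    latest:    {name: version tuple} newest release on the toy index
    requires:  {name: [(dependency, minimum version tuple), ...]}
    strategy:  "eager" or "only-if-needed"
    """
    if strategy not in ("eager", "only-if-needed"):
        raise ValueError(strategy)
    changes = {}
    seen = set()
    # Each queue entry is (name, minimum acceptable version, is_target).
    queue = [(target, (0,), True)]
    while queue:
        name, minimum, is_target = queue.pop()
        if name in seen:
            continue  # toy simplification: first visit wins
        seen.add(name)
        current = installed.get(name)
        satisfied = current is not None and current >= minimum
        # The named package is always brought up to date; a dependency is
        # touched only when eager, or when the installed version is too old.
        if is_target or strategy == "eager" or not satisfied:
            if current != latest[name]:
                changes[name] = latest[name]
        for dep, dep_min in requires.get(name, []):
            queue.append((dep, dep_min, False))
    return changes


# Made-up versions, purely for illustration.
installed = {"pyramid": (1, 6), "zope.interface": (4, 1), "setuptools": (20, 0)}
latest = {"pyramid": (1, 7), "zope.interface": (4, 2), "setuptools": (25, 0)}
requires = {"pyramid": [("zope.interface", (3, 8)), ("setuptools", (0,))]}

eager = plan_upgrade("pyramid", installed, latest, requires, "eager")
needed = plan_upgrade("pyramid", installed, latest, requires, "only-if-needed")
```

In this toy environment the eager plan touches all three packages, while the only-if-needed plan touches Pyramid alone because the installed dependencies already satisfy its minimums.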


Proposal

  1. Make a major version release that deprecates current behavior and provides a warning on use of these commands with opt-in flags and configuration to the new behavior.

    • Flag(s) should be provided to allow the user to check out the new behavior to be introduced. Using the flag(s) in this version would imply --upgrade.
    • maybe, pip install --upgrade warns that this flag will become a no-op in the next release.
    • pip install warns that the behavior is changing in the next release and that the current behavior won't be available in the next release.

    Possibly, both warnings provide a link to documentation that suggests to the user what they should do.

  2. Switch to new behavior in next major version release.
    • If someone really needs the current behavior, a --no-upgrade flag may be added. But I don't want to see that unless someone really needs it.

Bikeshed: Options and flags in 1. I prefer to add a --upgrade-strategy=(eager / non-eager / default) as the flag in 1 and switch the default strategy to eager in 2.

pradyunsg commented 8 years ago

Also worth pointing out, explicitly, there is no need for a dependency resolver in pip for this. While with the new behavior it's still possible to break some line edge in the entire dependency graph, it becomes less likely if you upgrade less often.

how dependencies are handled

Uniformly independent of depth. The user can choose between eager and non-eager upgrades. They are as I had defined in my earlier write-up.

what happens with constraints (and when they conflict)

I would say whatever happens today.

binary vs source

To be handled in #3785. Until then, keep as is.

pfmoore commented 8 years ago

I think it was clear that we wanted to provide both the current "recursive-latest" and the new default "only-if-needed" upgrades.

Nope, I don't think so. The "only if needed" behaviour is, as far as I know, agreed by everyone as what we would like to have available. But I understood the current behaviour to be generally considered as having issues. Whether those issues all revolve around the "pip needs a proper dependency resolver" problem, and we're OK with keeping the current behaviour until that is fixed, I don't really know.

dstufft commented 8 years ago

The main problem(s) with the current behavior (that isn't actually a result of the lack of a real resolver) is that the "greedy-ness" of it causes things to be upgraded that might not otherwise be upgraded. On the tin that doesn't seem like a big problem, however it has some subtle (and some not so subtle) interactions:

The first two of those are things that could possibly be fixed, at least in part, by other solutions (and for which, this solution isn't a total fix either). You could fix the breaking of the OS by making pip smarter about not mucking around with the OS files by default. Wheels make it easier to install even hard to build libraries like Numpy but not everything has a Wheel, and if you're on anything that isn't Windows, OS X, or manylinux1 then your chances of getting a wheel are basically zero.

The churn on what is installed is only going to be fixed by this patch, as well as reducing the occurrence of the first two issues (by being more conservative when we actually attempt to do anything).

dstufft commented 8 years ago

Of course, this is a super subtle sort of difference and it's hard to nail down all of the exact benefits (they'd be more accurately described as trade offs, rather than a straight set of benefits). I don't know if the old behavior is something that, in the cases it's useful, it's useful enough that people would bother using a flag for it or not. If we add the flag, it becomes hard to ever remove it, if we don't add it now, we could always add it again in the future, so for that reason I lean somewhat towards leaving it out and waiting to see if we get people asking for a way to bring the old behavior back.

pradyunsg commented 8 years ago

I think it was clear that we wanted to provide both the current "recursive-latest" and the new default "only-if-needed" upgrades.

Nope, I don't think so

Hmm... I did think that both behaviors were seen as useful. That's what the Pyramid example made me think. It's using the current behavior and it does exactly what is desired.

It seems desirable to be able to say "upgrade pkg and all its (sub-)*dependencies to latest version". I don't want to upgrade everything in my ecosystem, I just want to get the latest bug-fixes for pkg and dependencies.

  • Some libraries are very expensive to build, particularly ones like Numpy where compiling can take 30+ minutes.

By conservatively upgrading packages, it does make this happen less often.

Edit: You mentioned that.

  • The recursive upgrade introduces more churn on the installed set of packages, which increases the likelihood that something that was already working, breaks because of an upgrade to a shared dependency.

This needs a dependency resolver to be fixed. I consider that out-of-scope of this issue.

If we add the flag, it becomes hard to ever remove it, if we don't add it now, we could always add it again in the future, so for that reason I lean somewhat towards leaving it out and waiting to see if we get people asking for a way to bring the old behavior back.

That works pretty well with me. Adds to why I want a "deprecation" release for the current behaviour to get people asking for it to stay, rather than re-added.

Edit: s/version/behaviour/


:confused: Any comments on my proposal above?

dstufft commented 8 years ago
  • The recursive upgrade introduces more churn on the installed set of packages, which increases the likelihood that something that was already working, breaks because of an upgrade to a shared dependency. This needs a dependency resolver to be fixed. I consider that out-of-scope of this issue.

No, this isn't related to the dependency solver thing. This is just "software is hard, and new versions sometimes add new bugs, therefore, the more churn you have, the more likely you are to get bit by new bugs".

The most stable (in terms of new, not previously encountered bugs) software is software that never changes.
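The churn argument can be put in rough numbers. The sketch below assumes, purely for illustration, that every upgraded package carries the same independent chance `p` of introducing a regression; real upgrades are neither independent nor equally risky, and the value of `p` used here is invented.

```python
def chance_of_regression(p, upgrades):
    """Probability that at least one of `upgrades` independent upgrades breaks something."""
    return 1 - (1 - p) ** upgrades


# Invented per-upgrade risk, for illustration only.
p = 0.02
one_package = chance_of_regression(p, 1)     # upgrade only the requested package
ten_packages = chance_of_regression(p, 10)   # also upgrade nine dependencies
```

Under these assumptions the risk grows with every extra package touched, and an environment where nothing is upgraded has no new regressions at all, which is the point being made above.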

Any comments on my proposal above?

I'm a little concerned about adding a warning for every invocation of pip install, but I'm not opposed to it-- it's certainly the safer route though and it's one that's more in line with our typical deprecation process and it gives a chance for people to clamor for an option to use the old behavior.

I do think that we need to either deprecate the --upgrade flag completely as part of this (probably no-op it and hide it for a long while), or we need to add --no-upgrade to get back to the old behavior of pip install .... I don't want a fairly useless --upgrade flag laying around in our help. So then the question for a --[no-]upgrade flag becomes whether we see the current behavior of pip install useful at all. Here again I don't have a strong opinion-- We could use the deprecation period again as a chance to see.

pfmoore commented 8 years ago

Any comments on my proposal above?

Honestly, I really don't like the idea that essentially every invocation of pip install will give a warning for a full major release cycle. That seems guaranteed to just annoy users, and as a result we'll probably get no useful feedback, just a lot of complaints about the process.

My preferences remain with @njsmith's approach - probably the "just go for it" approach, but if necessary the gradual version.

I have to admit that I find it very hard to understand the impact on my day to day usage of these various proposals. There's a lot of theory and edge cases being discussed, which is obscuring the key points. I think that whatever transition process we adopt, someone should work on a clear "press-release" style description of the proposed changes and their impact, which we can publish on distutils-sig before making the changes. That should allow us to gauge reactions from the wider community. (I don't think this needs a PEP, but I do think it needs publicising).

My instinctive feeling is that I'll be (mildly) happy by the new "as little as possible" upgrade behaviour, mildly irritated by the fact that "install" now upgrades without an explicit flag (but hopefully I'll get used to it reasonably quickly) and otherwise mostly indifferent. My main usage will probably remain pip install new_thing to install a new package and a manual "get all the package names, and do pip install <all of them at once>" to manually simulate "update all". Neither of these will be affected by any of the proposals (except that the new "as little as possible" upgrade strategy will avoid the odd unwanted numpy upgrade attempt that the current behaviour inflicts on me).

For me, the tipping point comes when --prefer-binary and "upgrade all" become available. Those will affect my usage, and it won't really be until then that I'll see any benefits (or issues) with the change to upgrade strategy.

pradyunsg commented 8 years ago

Honestly, I really don't like the idea that essentially every invocation of pip install will give a warning for a full major release cycle. That seems guaranteed to just annoy users, and as a result we'll probably get no useful feedback, just a lot of complaints about the process.

Indeed. I didn't think about that in a hurry to leave. Oops!

My point is, I really want pip itself to have a major version deprecation run with such a major change to the main command of it. Any form it takes, I'm game.

I think being selective about when we show the warning message is the way forward.

How do you choose? @njsmith suggested only when the behaviour differs. Other than the fact that it's essentially doubling the work done in every install execution, as long as we publicise well (in advance and in detail), I think it's a good idea.


edit

Or maybe not on second thought. It won't be showing the message to everyone like we would want to. I would want to show it to everyone at least once.

How about some configuration file magic, asking the user to set a flag in the configuration file? This is where an --upgrade-strategy=default or similar flag would come in handy.

Any alternate ideas for this?


the tipping point comes when --prefer-binary and "upgrade all" become available. Those will affect my usage, and it won't really be until then that I'll see any benefits (or issues) with the change to upgrade strategy.

True. While this change will fix some issues (unnecessary re-installs) directly, I think it might indirectly help resolve other issues as well.

FichteFoll commented 8 years ago

Similarly to @pradyunsg's last idea, iirc git shows (kinda long) messages for when it introduced or is going to introduce a big change that you can disable by setting a configuration via commandline that is mentioned in the message. I've liked that so far.

dstufft commented 8 years ago

A temporary option to disable the message wouldn't be the worst possible behavior.
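The git-style approach discussed above (show a transition notice on every run until the user records an opt-out in their configuration) could look roughly like the sketch below. The configuration key `upgrade-notice`, the message text, and the use of a plain dict in place of a parsed configuration file are all assumptions made for this illustration; none of them are real pip settings.

```python
import sys

# Message text and config key are invented for this sketch.
NOTICE = (
    "NOTICE: the default behaviour of this command will change in the next "
    "major release.\n"
    "To silence this message, set 'upgrade-notice = false' in your config."
)


def notice_for(config):
    """Return the notice text, or None if the user has opted out.

    `config` is a plain dict standing in for a parsed configuration file.
    """
    if config.get("upgrade-notice", True) is False:
        return None
    return NOTICE


if __name__ == "__main__":
    message = notice_for({})  # no opt-out recorded yet
    if message:
        print(message, file=sys.stderr)
```

Keeping the check in one small function fits the idea that the deprecation-only code should live in a single place that a later release can stop calling and delete.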

njsmith commented 8 years ago

@pradyunsg: Before we get into the nitty-gritty of deprecation strategies... is there any chance I can convince you that the "option B" approach is okay? (Normally I wouldn't try, but given that core devs like @dstufft and @pfmoore are okay with it I guess I will try :-).) I definitely understand why you find just switching to be "rude" to users, but it's a complex trade-off -- not switching is also rude in different ways to different people. For example:

8.1.2 flat out broke a bunch of people's deployments due to a complicated bug involving the interaction between pip, pkg_resources, and devpi. It sucked but people dealt with it. Given our limited resources, it's a fact that we're going to sometimes break things and sometimes leave broken things sitting for years without progress and generally cause users pain. We can't change that, but we can at least be smarter about which kinds of pain we cause users, and "install starts working the way lots of users already expect" is a much more productive outcome than most :-).


@pfmoore:

You say pip require foo: same as the current pip install foo. So it'll error if foo is installed?

No, right now if foo is already installed then pip install foo does nothing and exits successfully. I was imagining pip require would be a way to directly talk to the constraint resolver: "here's a new constraint, please ensure it is satisfied". Semantically meaningful and well-defined, but a pretty low-level for-experts interface.

@dstufft: I find pip install --no-upgrade foo rather confusing, though -- from the name I'd expect that it would do something like... try to install foo but error out if foo had a dependency that would force the upgrade of something I already had installed? Which is kinda the opposite of what it would actually do. For me the require operation and the install operation are conceptually really distinct -- see also Guido's comments on how if you ever find yourself writing a function that takes a boolean arg, and you know that your callers will be passing a constant rather than a variable for that arg, then you should have two functions. So splitting it out into a new command was me trying to imagine what it might look like in a world where we added it for its own sake, rather than just to fulfill our obligation to have a --no form of --upgrade or whatever. But I'm also just as happy to drop it entirely for now...


Okay, how about this as a strategy:

That avoids the worst gratuitous breakage (there's no reason for pip install -U foo to become a hard error and invalidate tons of existing tutorials), but otherwise keeps things radically simple, so we can skip or defer thinking about things like --no-upgrade or the most ideal spelling for recursive upgrades and get the important parts moving ASAP.

njsmith commented 8 years ago

It seems desirable to be able to say "upgrade pkg and all its (sub-)*dependencies to latest version". I don't want to upgrade everything in my ecosystem, I just want to get the latest bug-fixes for pkg and dependencies.

The problem with this is that in lots of cases, it doesn't really make sense to assign some dependency to any particular dependant. Like, lots of people have environments with ~30 different packages installed, of which 1 is numpy and 29 are packages that depend on numpy. So if I want the new bug-fixes for astropy, should that upgrade my numpy? That might fix some issues with astropy but it might also break the other 28 packages, who knows. Pyramid's dependency chain includes a number of widely-used utility libraries like zope.interface and repoze.lru and setuptools (why? idk). So recursively upgrading Pyramid might break Twisted (which depends on zope.interface and setuptools and nothing else). There's no way that "I want the latest bug-fixes for Pyramid" implies "I want the latest setuptools" in most users' minds -- but that's how pip install -U currently interprets it.
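The shared-dependency problem described above can be seen by inverting a dependency table. The sketch below uses a hand-written toy table whose entries follow the packages named in the comment, simplified to direct dependencies only; the function names are invented, and nothing here queries a real environment.

```python
from collections import defaultdict


def dependants(requires):
    """Invert {package: [dependencies]} into {dependency: {packages needing it}}."""
    reverse = defaultdict(set)
    for package, deps in requires.items():
        for dep in deps:
            reverse[dep].add(package)
    return reverse


def shared_with_others(target, requires):
    """Map each direct dependency of `target` to the other packages that also need it."""
    reverse = dependants(requires)
    shared = {}
    for dep in requires[target]:
        others = sorted(reverse[dep] - {target})
        if others:
            shared[dep] = others
    return shared


# Toy table, simplified from the examples in the comment above.
requires = {
    "pyramid": ["zope.interface", "repoze.lru", "setuptools"],
    "twisted": ["zope.interface", "setuptools"],
    "astropy": ["numpy"],
}

at_risk = shared_with_others("pyramid", requires)
```

Here `at_risk` lists zope.interface and setuptools as dependencies that Twisted also relies on, so a recursive upgrade of Pyramid would change packages that another installed project depends on.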

pradyunsg commented 8 years ago

Similarly to @pradyunsg's last idea, iirc git shows (kinda long) messages for when it introduced or is going to introduce a big change that you can disable by setting a configuration via commandline that is mentioned in the message.

That's exactly where I got the idea.

I've liked that so far.

Ditto. Hence I would like to see it in pip. It's a field-tested process.

I do agree that every-run-warning is a bit too much but having it show all the time until the user acts on it is something I know, from git, works even for major changes like this.

is there any chance I can convince you that the "option B" approach is okay?

Maybe. You're right the trade-offs are complicated and having to wait a year till the switch isn't the most convenient thing either. Breaking certain niche-cases that don't affect everyone is fine. That is just going to happen. Here, we're changing the most used command of pip (in documentation of packages and otherwise). Doing so without a proper warning period might just not be the best of things to do. Nor should this be done without giving people some time to fix their tools/workflow/etc to work with the new behaviour.

With @njsmith's current proposal, I still don't get a proper warning or give people some preview of the upcoming (major) change. That's all, but it's enough that I don't like the proposal. If someone can convince me that dropping these two requirements would be fine and it's possible to properly inform people that this, a big change, is coming their way in some other manner, I'm fine with that.

If we get the deprecation nitty-gritties right, it should be possible to implement this in such a manner that the deprecation-release-only stuff stays in one module (module as in English; a class, function or something else) and the next major release just stops invoking that module and removes it. That way at least the post-deprecation work is minimized.

#59 has 199 comments from 56 participants, many of them just +1's. Making them wait another year is kinda rude too.

They don't have to wait another year. They can just opt-in to the new behaviour. We're just giving time to people whose stuff broke due to the change. Others can just opt-in to the nicer behaviour.

We keep --upgrade around as a no-op indefinitely, but take it out of --help, and the reference manual just says "no-op; kept for backwards compatibility". (Maybe in a few years we tear it out entirely, maybe not -- I don't care and am happy to just defer that discussion until a few years have passed.) [snip?] That avoids the worst gratuitous breakage (there's no reason for pip install -U foo to become a hard error and invalidate tons of existing tutorials)

If it wasn't obvious, this would happen in step 1 of my proposal. No one gets bothered by a no-op -U's presence. Its absence will invalidate many packages' documentation and break stuff. We'll keep it till it is rare enough to be safe to remove. That discussion should happen a few years later. (let's mark 16th September 2018 for this, for no reason whatsoever)

Regardless of whether I change my position on @njsmith's proposal, we'll keep a no-op --upgrade post-deprecation.


There's no way that "I want the latest bug-fixes for Pyramid" implies "I want the latest setuptools" in most users' minds -- but that's how pip install -U currently interprets it.

True. But this is due to the lack of a dependency resolver. Once it's added, it does exactly what the user wanted. There's only so much we can do till then. Adding a warning in the documentation about the potential breakage of the dependencies is sufficient for now IMO, since this behaviour shall become opt-in. And this assumes that the packages maintain their promises made through version-numbers. If they break, there's little pip can do until packages refine their version-specifiers.

As a side, I think there should be a piece of documentation mentioning that pip may break your dependency graphs.

So if I want the new bug-fixes for astropy, should that upgrade my numpy?

Not if it breaks your dependency graph. Nor if it removes your well-configured numpy. The former case needs a dependency resolver. The latter needs "holding back" of upgrades. Both are out-of-scope in this discussion.

Until we get those, the most we can do is tell people - "pip doesn't do the right thing all the time and we don't have the resources to fix it. Help would be appreciated."


This is just "software is hard, and new versions sometimes add new bugs, therefore, the more churn you have, the more likely you are to get bit by new bugs".

I can only say, sad but true to this.

pradyunsg commented 8 years ago

I am posting the mental picture of the post-deprecation behaviour that I have in my head... Just to make sure I don't miss out on anyone's concerns.


Once we have decided upon the required behaviour, I'll start working on the implementation. (I'm still familiarizing myself with the implementation details of pip install and #3194 right now.)

Let's finalize the behaviour and how we want to do the deprecation here and we'll bikeshed the option names in the PR I eventually make.


pip install --target <dir> is documented as "By default this will not replace existing files/folders in <dir>."

Since install shall now start upgrading (replacing) by default, it seems more consistent to replace the existing files and folders by default and provide some flag if the user wishes to have the older behaviour of not-replacing. AFAIK, this flag is undecided on. pip require has similarities. So, I think we can't defer the discussion on pip require and need to do it now.

The overlap with pip install and the need for it presented by install --target makes me want to have the require behaviour behind a flag in install.

njsmith commented 8 years ago

@pradyunsg:

Here, we're changing the most used command of pip (in documentation of packages and otherwise). Doing so without a proper warning period might just not be the best of things to do. Nor should this be done without giving people some time to fix their tools/workflow/etc to work with the new behaviour.

It's the most used command of pip, but we're only touching two weird corner cases: pip install foo where foo is already installed, and pip install -U foo where foo has some recursive dependency that's out of date. While I'm sure there will be some obscure breakage no matter what we do, I can't think of any sensible tools or workflows that would be broken by this -- can you give an example of what you're thinking of?

True. But this is due to the lack of a dependency resolver. Once it's added, it does exactly what the user wanted.

??? no idea what you mean here -- Pyramid recursively depends on setuptools, and my argument is that this demonstrates that "package and its recursive dependencies" doesn't actually correspond to any meaningful concept in the user's mental model. AFAICT this is totally orthogonal to the dependency resolver issue?

pip install --target <dir> ... Since install shall now start upgrading (replacing) by default, it seems more consistent to replace the existing files and folders by default

I think the issue with pip install --target <dir> is that it doesn't really install into an environment at all -- it's used for things like vendoring. And without an environment, the upgrade/install distinction doesn't even make sense. My vote is that we leave it alone -- the current behavior is fine IMO.

pip require has similarities.

It does?

pradyunsg commented 8 years ago

we're only touching two weird corner cases: pip install foo where foo is already installed, and pip install -U foo where foo has some recursive dependency that's out of date.

Hmm... Indeed. While the change is major, I do agree that it's just weird corner cases that we break. But I would really want to get some user input before making the change... It doesn't feel right to make such a change without a deprecation.

If everyone else here (mainly @pfmoore and @dstufft) says that they prefer no-deprecation switch over a deprecation switch, I guess I'll be fine with going ahead and implementing @njsmith's proposal.

True. But this is due to the lack of a dependency resolver. Once it's added, it does exactly what the user wanted.

Pyramid recursively depends on setuptools, and my argument is that this demonstrates that "package and its recursive dependencies" doesn't actually correspond to any meaningful concept in the user's mental model.

I disagree. It is a meaningful thing to want to get the latest possible version of a package and its dependencies. As an example, if I have found that my current environment has an issue related to pkgA, I would want to check against the latest releases of it and all its dependencies to eliminate the possibility of this being an issue that got fixed in a new release. I think it's reasonable to expect that to be possible.

Just to be clear, let's not provide the old behavior for the simple reason that it provides lazy people a way to keep the existing behavior if it works for them. We'll keep it only if we figure out some valid use-case. If we go down the deprecation path, it'll be deprecated but available till end-of-deprecation. If someone wants that behavior, they'll say they do and we'll pull it out of deprecation and let it stay.

AFAICT this is totally orthogonal to the dependency resolver issue?

The dependency resolver comes into play when A and B both depend on C, A is recursively upgraded, breaking C for B since pip does not care about B's version specifiers when handling A's. This was the example you gave with Pyramid, Twisted and zope.interface being A, B and C respectively.
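To make the A/B/C scenario concrete, here is a toy Python sketch. It is not pip's code: the index contents, the integer versions, and the helper names `eager_upgrade` and `broken` are all made up for illustration. It only shows how upgrading A's dependencies while looking at A's specifiers alone can leave B's declared requirement on C violated.

```python
# Toy model (hypothetical data): versions are ints, a constraint is an
# inclusive (low, high) range. None of this is pip's real data model.
INDEX = {"A": [1, 2], "B": [1], "C": [1, 2]}

# REQUIRES[(name, version)] -> {dependency: (low, high)}
REQUIRES = {
    ("A", 1): {"C": (1, 2)},
    ("A", 2): {"C": (1, 2)},
    ("B", 1): {"C": (1, 1)},  # B only works with C == 1
    ("C", 1): {},
    ("C", 2): {},
}


def eager_upgrade(installed, name):
    """Upgrade `name` and, recursively, its dependencies to the newest
    version allowed by the requesting package only (other installed
    packages' constraints are ignored)."""
    result = dict(installed)
    stack = [(name, (min(INDEX[name]), max(INDEX[name])))]
    while stack:
        pkg, (low, high) = stack.pop()
        result[pkg] = max(v for v in INDEX[pkg] if low <= v <= high)
        stack.extend(REQUIRES[(pkg, result[pkg])].items())
    return result


def broken(installed):
    """Return (package, dependency) pairs whose declared range is violated."""
    bad = []
    for pkg, ver in installed.items():
        for dep, (low, high) in REQUIRES[(pkg, ver)].items():
            if not low <= installed[dep] <= high:
                bad.append((pkg, dep))
    return bad


before = {"A": 1, "B": 1, "C": 1}
after = eager_upgrade(before, "A")
print(broken(before))  # []
print(after)           # {'A': 2, 'B': 1, 'C': 2}
print(broken(after))   # [('B', 'C')]
```

A resolver would notice B's range before moving C. This sketch deliberately does not, which is the failure mode being described.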

pip require has similarities.

It does?

Yes, in that it also does not affect already-installed packages. But on reviewing this, they are more different than similar. This option is more along the lines of --avoid-installed. I don't know why I thought they were similar enough to merge...

pfmoore commented 8 years ago

@njsmith

No, right now if foo is already installed then pip install foo does nothing and exits successfully

What I see is

>pip install xlrd
Requirement already satisfied (use --upgrade to upgrade): xlrd in c:\users\uk03306\appdata\local\programs\python\python35\lib\site-packages

I'm not sure about the exit status, I was thinking about the user experience. Apologies, I was being sloppy in my wording - I meant that I "get an error message" (maybe it's technically a warning) rather than that pip sets the exit code to error. But either way it's a minor point.

Responding to other emails:

I agree with @njsmith that deprecation is in many ways just as bad an experience for users as a sudden change. In this case I remain in favour of just going straight to the improved version. There's been plenty of debate on the tracker, and lots of people have noted their interest in seeing the new approach land. @pradyunsg if you still feel that we should warn users, then by all means post on distutils-sig (and even python-list if you feel it's warranted) and announce the plan there. There's a risk that doing so results in even more bikeshedding and debate, which may or may not be productive, but that's the nature of packaging changes :-)

I'm also in agreement that I don't see "Pyramid and all its dependencies" as a particularly useful thing to want to upgrade. Pyramid itself, of course. And Pyramid and selected dependencies, quite possibly. And certainly "everything in this virtualenv (which was set up for my Pyramid development)".

Which prompts the thought - how often would people asking for eager upgrades be better served by using virtualenvs and upgrade-all? I can't speak for other people's workflows, but it's certainly how I tend to operate. And of course for many environments, pip freeze and exact version restrictions are the norm, so eager updates would be inappropriate there.

Finally, we've decoupled "pip needs a solver" from this proposal - so arguing that eager is useful once we have a solver isn't relevant right now. Current eager behaviour can break dependencies - so we should remove it, and then maybe reintroduce a working version once we have a solver and we've had feedback that (a not-broken version of) the feature is useful to people.

pradyunsg commented 8 years ago

if you still feel that we should warn users, then by all means post on distutils-sig (and even python-list if you feel it's warranted) and announce the plan there.

I think announcing on distutils-sig sounds fine to me. python-list, I'll think about it.

There's a risk that doing so results in even more bikeshedding and debate, which may or may not be productive, but that's the nature of packaging changes :-)

That's a trade-off. I guess I'll redirect them to the PR for the bikeshedding and take other comments on the mailing list...

Quick correction: I really should have mentioned the entire help-text of --target.

""" Install packages into <dir>. By default this will not replace existing files/folders in <dir>. Use --upgrade to replace existing packages in <dir> with new versions. """

If we are making --upgrade a no-op, --target should not depend on it. We need to figure this out.

Finally, we've decoupled "pip needs a solver" from this proposal - so arguing that eager is useful once we have a solver isn't relevant right now. Current eager behaviour can break dependencies - so we should remove it, and then maybe reintroduce a working version once we have a solver and we've had feedback that (a not-broken version of) the feature is useful to people.

Sounds good to me. I guess we can drop the eager upgrade behavior. It's easy to add it if we need to. Removing it (after the switch), not so much. I do think not providing it and advocating use of virtualenv for the job is a good idea.

pradyunsg commented 8 years ago

@pfmoore I take it that you wish to go down the no-deprecation path.

I'm also in agreement that I don't see "Pyramid and all its dependencies" as a particularly useful thing to want to upgrade. Pyramid itself, of course. And Pyramid and selected dependencies, quite possibly.

When you put it that way, it makes sense why what I was saying is not ideal.

Current eager behaviour can break dependencies

I think any package change has the potential to. The non-eager behavior just reduces the number of changes and thus works around this issue fairly well enough to reduce breakages substantially.

Anyway, I take it that it's decided that eager upgrades would be dropped.

We need to figure this out.

Maybe reuse --force-reinstall? I don't know enough about these options to be sure...


@dstufft I'm waiting for your views on deprecation vs no-deprecation.

pradyunsg commented 8 years ago

So, that leaves us with --upgrade and --target only. (and @dstufft's vote)

I request anyone with any issues/requirements, that they feel haven't been handled, to bring them up now. Not that it's the last chance or anything, just a good time to do so.

pfmoore commented 8 years ago

Current eager behaviour can break dependencies

I think any package change has the potential to.

Specifically current eager behaviour can leave the system in a state where declared dependency requirements (which aren't inconsistent, or otherwise broken) are violated when they were not previously. That is not acceptable, and is what a "proper solver" should address. For the simpler "only as needed" upgrades, my understanding is that the risk of such breakage is minimised even without a solver.

So, that leaves us with --upgrade and --target only.

Apart from changing the help text of --target to not refer to --upgrade, I consider --target to be out of scope here. The help text is

Install packages into <dir>. By default this will not replace existing files/folders in <dir>. Use --upgrade to replace existing packages in <dir> with new versions.

I propose we just replace it with

Install packages into <dir>.

Presumably the default will change (as with normal "install") to overwrite by default, and if you don't want to overwrite, you just don't run the install command (same as if you're installing into site-packages). If users want anything more complex, they can work out the appropriate commands, let's not worry about trying to offer suggestions (that may or may not be helpful in practice).

pradyunsg commented 8 years ago

The help text is

Install packages into <dir>. By default this will not replace existing files/folders in <dir>. Use --upgrade to replace existing packages in <dir> with new versions.

I propose we just replace it with

Install packages into <dir>.

Hmm... Are you sure that you want to remove the functionality of not replacing existing files/folders?

pfmoore commented 8 years ago

Are you sure that you want to remove the functionality of not replacing existing files/folders?

It's not me that was advocating that - @dstufft and @njsmith argued strongly that "install" should upgrade when given an already installed package. The only thing I'm adding is that I don't think the behaviour should be different just because the user specified --target.

Maybe having a --no-replace option is needed, but if so it should apply to both --target and non---target installs.

pradyunsg commented 8 years ago

Off Topic

At the cost of being picky, a tiny markdown suggestion/request/tip/{whatever_you_want_to_call_it} - Keep an empty > line in block quote to make it dedent... Otherwise it just merges into the higher-level quote...

> > > A
> >
> > B
> C
>
> D

A

B C

D

Do note how B and C came up on the same level of quoting but D actually got the dedent...

pradyunsg commented 8 years ago

Maybe having a --no-replace option is needed, but if so it should apply to both --target and non---target installs.

:+1:

pfmoore commented 8 years ago

At the cost of being picky, a tiny markdown suggestion/request/tip

Thanks. I try to do "preview" but missed that.

pradyunsg commented 8 years ago

I will be starting my implementation work off master, on Monday. We're nearly decided on almost everything and even if @dstufft says we want deprecation, the new behaviour to be introduced has to be provided anyway.

Here's what I'm going to start implementing:

I think we decided we'll keep --upgrade around for now (for backwards-compatibility), but we haven't decided about deprecation and eventual removal. Should it be removed using the normal deprecation cycle, starting v9.0 (I think it's alright if we remove it in 10.0/11.0...)?


As an aside, I was thinking: since this change will make the next major version an intentionally backwards-incompatible release, would it make sense to try to push for some other issues to be fixed in the same release? If so, are there any such issues?

It would help maximize the utility of our decision to break backwards compatibility.

waiting on @dstufft's comment

edit: Added "on Monday", moved stuff around.

pradyunsg commented 8 years ago

Hello.

Quick apologies for the lack of activity over the past week... Some other urgent work came up and took some of my time. Anyway, I have started to work on this issue's implementation.

njsmith commented 8 years ago

@pradyunsg: I don't understand what --no-replace is for. --target is a weirdo option that almost got deprecated a few months ago, and may or may not survive in the long term, so if it's for --target specifically then it's very low priority and I wouldn't worry about it for now.

pradyunsg commented 8 years ago

Currently --target has a dependency on --upgrade. The current (default) behaviour of --target is to not replace files and folders already in the target-dir. Passing --upgrade changes this to replace files and folders already in the target-dir.

Since install now replaces (read: upgrades) packages by default, it seems to make sense to switch the default behaviour of install --target correspondingly. This would make --upgrade a useless flag for --target, which is what we want (--upgrade becoming a no-op that would eventually be removed). Then, a new option would have to be introduced to retain the current behaviour of --target. This is the --no-replace.

Then, for consistency, if --no-replace works with --target runs, it should also work with non-target ones. AFAICS, the latter is new behaviour.

I guess even if --target doesn't survive very long, it might make sense to have a --no-replace that works regardless of --target. I don't know if someone would want that functionality without --target though.

PS: Apologies for littering so many inline-monospace blocks.

pfmoore commented 8 years ago

I don't think --target (and specifically its current default behaviour) is important enough to warrant adding a new flag just to retain it. IMO, we just switch --target to replace by default, and lose the ability to only add new files (which seems likely to result in broken setups anyway).

Not upgrading an already installed package is a safe operation, but --target doesn't do that, because it doesn't have access to "what is currently installed" information.

pradyunsg commented 8 years ago

So, change the behaviour of --target to stop bothering about already-present directories and just go about replacing them, printing a message as it does so? Or even without printing a message?

pfmoore commented 8 years ago

Hmm, wait. Sorry, your description confused me (and I didn't go back to check the docs). Sorry. My above comment was wrong. What I should have said:

Currently --target doesn't replace stuff. That is necessary, because it cannot safely uninstall/upgrade (there's no installed package database with --target and no guarantee that a new version doesn't have a different set of files than the previous version). The current behaviour of --upgrade --target is (AFAICT) unsafe.
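A small Python sketch of why overwriting without an installed-files record is unsafe. This is an illustration only, not what pip does internally: the file lists `V1` and `V2` and the helpers `overwrite_into` and `listing` are hypothetical. It copies two "releases" of a package into the same directory and shows that a module dropped in the newer release is left behind.

```python
import tempfile
from pathlib import Path

# Hypothetical file lists for two releases of the same package.
V1 = {"pkg/__init__.py": "v1", "pkg/old_mod.py": "v1"}
V2 = {"pkg/__init__.py": "v2", "pkg/new_mod.py": "v2"}


def overwrite_into(target, files):
    """Write files into target, replacing same-named files, with no
    knowledge of what an earlier install put there."""
    for rel, text in files.items():
        path = target / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text)


def listing(target):
    """Return all files under target as sorted POSIX-style relative paths."""
    return sorted(
        p.relative_to(target).as_posix()
        for p in target.rglob("*")
        if p.is_file()
    )


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp)
    overwrite_into(target, V1)  # first install
    overwrite_into(target, V2)  # "upgrade" by overwriting in place
    leftover = listing(target)

print(leftover)
# ['pkg/__init__.py', 'pkg/new_mod.py', 'pkg/old_mod.py']
```

The stale `pkg/old_mod.py` stays importable after the "upgrade", which is the kind of mixed state a proper uninstall step would avoid.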

So --target should keep its current behaviour. This does make it inconsistent with the new install, but that's fine, it's for a completely different use case. I don't have a problem with --upgrade being removed, and as a result --target loses that capability - it's an unsafe operation anyway.

Given that I disagree with changing the default behaviour of --target, there's no need for a --no-replace flag.

I'm not sure what you mean by --target having a dependency on --upgrade.

pradyunsg commented 8 years ago

It might help the discussion to read the current help text of --target.

""" Install packages into <dir>. By default this will not replace existing files/folders in <dir>. Use --upgrade to replace existing packages in <dir> with new versions. """

I'm not sure what you mean by --target having a dependency on --upgrade.

To enable replacing existing stuff.

Given that I disagree with changing the default behaviour of --target, there's no need for a --no-replace flag.

If the behaviour of --target is not changed, it would mean --upgrade flag would need to stay at least as long as --target is there.


I want to remove the need for referring to --upgrade in --target's help.

pfmoore commented 8 years ago

OK, let me rephrase. The behaviour of --target should (IMO) be changed in one respect only, that --upgrade (and the behaviour it enables) should be removed.

If someone can demonstrate a use case for --upgrade (given that it potentially breaks things) then I'm willing to review that position, but I don't think it's worth keeping "just in case".

pradyunsg commented 8 years ago

The behaviour of --target should (IMO) be changed in one respect only, that --upgrade (and the behaviour it enables) should be removed.

Okay. That makes it clear.

If someone can demonstrate a use case for --upgrade

Not me.

njsmith commented 8 years ago

That sounds fine to me too. It strikes me as a nasty wart that --target used --upgrade for this purpose in the first place.

pradyunsg commented 8 years ago

I think we should move the further discussion over to #3806 to avoid having 2 comment threads with simultaneous discussions on the same thing.

rbtcollins commented 8 years ago

Wow this thread has gone critical. Let me just add my strong opposition to changing the meaning of -U. There's absolutely no need to break our users' muscle memory - we can add a new option if we need non-recursive upgrades. That said, what's the use case for non-recursive upgrades other than 'pip install named-thing' ?

E.g. I think it's fine to say that explicitly named distributions upgrade implicitly, and -U if provided causes fully recursive upgrading. In all cases without --ignore-dependencies, pip will recursively check for satisfaction.

njsmith commented 8 years ago

@rbtcollins: "breaking our users' muscle memory" seems a bit strong -- WRT -U, the changes in the current proposal would be: (a) pip install -U foo is still legal and still upgrades foo, but now non-recursively, (b) it loses the special behavior where combining -U plus --target means "overwrite any existing files". I'm guessing that the latter change is not one that worries you overmuch given that you recently tried to deprecate --target and that most users don't have muscle memory for -U --target (I hope!!). So I guess you're saying specifically that you prefer that pip install foo do a non-recursive upstall of foo, and that pip install -U foo do a recursive upstall of foo?

I could live with this, especially as a transitional state where we deprecate -U at the same time, but it definitely has downsides:

So I'd much rather we move on and make pip install foo / pip install --upgrade foo do the obvious thing that everyone else does. I think most users' muscle memories will actually be pleasantly surprised to start getting what they were hoping for in the first place :-).

pradyunsg commented 8 years ago

@njsmith's comment provides a nice summary of why we're doing this.

I guess I should link to https://gist.github.com/pradyunsg/4c9db6a212239fee69b429c96cdc3d73 from here. This is the final "proposal" I wrote, that came out of this issue's discussion. It's got a section about "Current State of Affairs" that I think @rbtcollins would like to read.

rbtcollins commented 8 years ago

@njs - I don't think it's too strong: right now, folk know that to get the latest across the board they run 'pip install -U X'. That's the only reason to run install -U ever (today), and so breaking it is breaking its primary use case.

The behaviour with --target is indeed not the case I'm worried about.

FWIW I disagree with your analysis about what people do/don't want. Most projects only test a small number of permutations of versions: latest-with-latest + latest-with-stable, when a stable exists. Upgrading everything is actually safer than upgrading only the named component because folks' lower version specifiers are usually wrong. See #3188 for an enhancement that would make testing lower version limits much easier. I have lost count of how many times I've 'fixed' folks' problems by telling them to 'pip install -U' : they've had a package with an incorrect lower minimum.

The actual underlying thing that drives your 'this is wrong' is #2687 as far as I can tell - that's where pip can do the right thing.

Further, the very last thing we want is for pycrypto and friends to stay un-upgraded for months or years because folk don't know they have to do something special to have up to date secure software.

If folk are running very complex venvs, they are opting into the complexity - the common cases are a) full Python installs and b) dedicated venvs. We should steer everyone to b as much as possible because it's inherently more reliable, and that strengthens the argument I'm making that the default should be to be secure, and running as close to what upstream will have tested as possible.

w.r.t. package managers - 'apt install X' will never upstall - it only installs. 'apt upgrade' is global - it upgrades everything. DNF is similar AIUI. I haven't canvassed suse's tool, but I'd expect similar behaviour because of the flattened there-can-be-only-one idiom distros use.

Perhaps we should make a higher bandwidth discussion for this? It seems to be pointed in a pretty dangerous direction IMO.

@pradyunsg your assertion about pip's current status in https://gist.github.com/pradyunsg/4c9db6a212239fee69b429c96cdc3d73 is factually incorrect: there is already --no-dependencies switch which covers off the recursive/non-recursive case. 'pip install -U foo --no-deps && pip install foo' should be semantically equivalent to the 'upstall named packages by default' - and I'm fine with that.

pradyunsg commented 8 years ago

No tl;dr. Read it.


@rbtcollins

your assertion about pip's current status in https://gist.github.com/pradyunsg/4c9db6a212239fee69b429c96cdc3d73 is factually incorrect: there is already --no-dependencies switch which covers off the recursive/non-recursive case

I never asserted that pip does not provide the possibility to do non-eager upgrades or that there is the lack of a --no-deps in the write-up or (in my memory) anywhere else. Which part of my "assertion about pip's current status" do you feel is "factually incorrect"?

Do consider re-reading this section and explicitly pointing out any "factually incorrect" points in a comment on the Gist (not here, it'll be noise) so that I can correct them.

That's the only reason to run install -U ever (today), and so breaking it is breaking its primary use case.

No one's going around breaking the world.

FWIW I disagree with your analysis about what people do/don't want. Most projects only test a small number of permutations of versions: latest-with-latest + latest-with-stable, when a stable exists.

If the package developer provides poor metadata, it is not wrong behaviour on pip's side that it broke the user's environment because of that. It's the responsibility of the package developer to provide proper version constraints. I do agree that #3188 would help the package developer do so.

I don't think it's wrong to expect people to improve the metadata they provide to PyPI (and hence pip).

the very last thing we want is for pycrypto and friends to stay un-upgraded for months or years because folk don't know they have to do something special to have up to date secure software.

Agreed. I do think that if it's secure software, there should be extra attention given to the security packages. Moreover, any packages that are skipped from upgrades are explicitly listed as such. So, someone looking at the output would know what's happened and determine if they wish to take action.

If you care about a security package, after this change, you can simply mention it directly on the CLI, which makes your intentions more explicit and clear. I prefer it this way. This change would force you to mention which packages you care about being up to date.

"explicit is better than implicit"

If folk are running very complex venvs, they are opting into the complexity - the common cases are a) full Python installs and b) dedicated venvs. We should steer everyone to b as much as possible because it's inherently more reliable, and that strengthens the argument I'm making that the default should be to be secure, and running as close to what upstream will have tested as possible.

I agree that everyone should be using virtual environments more often. I also agree that running as close to upstream as possible is also favourable. I find it ironic that you use the word "secure" to defend a behaviour that silently (and often) breaks the dependency-graph.

'pip install -U foo --no-deps && pip install foo' should be semantically equivalent to the 'upstall named packages by default'

It is. The whole motivation of this PR is to provide pip install -U foo --no-deps && pip install foo as pip install foo because the behaviour that everyone wants most of the time should be directly available. It was discussed and decided that it's better to not provide any way to do eager upgrades.
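The two-step equivalence can be modelled with a toy Python sketch. Everything here is an assumption made for illustration: the `foo`/`bar` index, the integer versions, and the helpers `upgrade_no_deps`, `install_if_needed` and `upstall` are hypothetical stand-ins, not pip internals. The model assumes that a plain install leaves an already-installed dependency alone unless it is missing or outside the declared range, and it does not handle dependency cycles.

```python
# Hypothetical index: versions are ints, constraints are inclusive ranges.
INDEX = {"foo": [1, 2], "bar": [1, 2, 3]}
REQUIRES = {
    ("foo", 1): {"bar": (1, 3)},
    ("foo", 2): {"bar": (2, 3)},
    ("bar", 1): {},
    ("bar", 2): {},
    ("bar", 3): {},
}


def upgrade_no_deps(installed, name):
    """Model of `pip install -U name --no-deps`: newest `name`, nothing else."""
    result = dict(installed)
    result[name] = max(INDEX[name])
    return result


def install_if_needed(installed, name):
    """Model of a plain `pip install name`: install what is missing and
    change a dependency only when it is outside the declared range."""
    result = dict(installed)
    result.setdefault(name, max(INDEX[name]))
    stack = [name]
    while stack:
        pkg = stack.pop()
        for dep, (low, high) in REQUIRES[(pkg, result[pkg])].items():
            if dep not in result or not low <= result[dep] <= high:
                result[dep] = max(v for v in INDEX[dep] if low <= v <= high)
            stack.append(dep)
    return result


def upstall(installed, name):
    """The proposed `pip install name`: both steps in one command."""
    return install_if_needed(upgrade_no_deps(installed, name), name)


print(upstall({"foo": 1, "bar": 2}, "foo"))  # {'foo': 2, 'bar': 2}
print(upstall({"foo": 1, "bar": 1}, "foo"))  # {'foo': 2, 'bar': 3}
```

In the first call `bar` already satisfies the new `foo`, so it is left at 2 even though 3 exists. In the second it has to move, so it does.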

The actual underlying thing that drives your 'this is wrong' is #2687 as far as I can tell - that's where pip can do the right thing.

It has been concluded in prior discussions (#59, at pypa-dev) that the behaviour in #2687 is not fixable until #988 lands, which may well take a fair bit of time, and this behaviour is seen as the safer middle ground in the meantime.

Today, every time someone runs pip install -U pkg, they risk breaking some other package in their environment. While there will still be the same risk even after this PR, the number of times that pip's actions result in the environment breaking is reduced.

I'm fine with that

You're fine with having it as an opt-in behaviour to do non-eager upgrades. I'm not fine with breaking the user's environment silently, by default. That's what eager upgrades do as I see it, with #2687 unresolved.

It would be better to not be breaking the user's environment silently. This change is the best we can do for that given the limited development time that gets directly invested in pip.

Perhaps we should make a higher bandwidth discussion for this?

That's the idea behind the "shout-out" on distutils-sig.