hickford opened this issue 9 years ago
Mostly agree with @takluyver, but some notes:
Merging twine into pip doesn't add a new and distinct support burden to the PyPA team, because they're already maintaining pip and twine. Consolidation could even potentially simplify things from the maintainer side.
This thread has devolved into a discussion of these workflow tools, which is somewhat dismaying.
Fighting against tool bloat doesn't mean that I want to ask PyPA to maintain a poetry/flit/whatever competitor.
I'd adore pip publish, which builds distributions and does the twine step. But citing that as "too much" is not a reason not to combine pip and twine.
All of the objections to merging twine into pip seem to be based on principles about what pip should and should not do. But combining the tools is an eminently tractable step which would simplify people's lives.
If these waters are too muddy, I may open a new and distinct issue to suggest exactly that: combine pip and twine, name TBD as pip upload, pip twine, pip idontcare, etc.
I'd point out that "The PyPA team" isn't really a distinct set of people, each project under the PyPA generally operates more or less independently except when we need to agree on some standard for interoperability. In this case, the folks maintaining pip are more or less wholly distinct from the folks maintaining twine.
That being said, as the original author of twine, it was generally my intention that once it baked a bit, we would add it into pip. That was my opinion and I don't know how the other pip maintainers feel about it.
One possible upside of keeping the two tools separate is that you can provide more focused defaults or tooling for specific tasks where the consumer and the producer may have similar tasks, but a desire for different defaults.
For an example, pip wheel will produce a wheel for a given target (directory, package name, whatever) as well as for all of its dependencies, with the ultimate goal that you end up with a directory full of wheels for the entire dependency chain. Now twine doesn't have a twine wheel command, but you could imagine it gaining one that was intended to function similarly to python setup.py bdist_wheel, just utilizing the PEP 517 hooks. In that case you probably wouldn't want to build the wheels for the entire dependency chain, just for the current project. Having separate tools makes it possible to optimize the workflows, whereas with a singular tool you have to juggle between competing desires more and can end up making a worse workflow for both groups of people.
That being said, obviously having two tools instead of one tool is also a specific trade off that says that paying extra complexity for authors (having to install another tool and use it) is worth it to remove some complexity from end users (removing commands they aren't likely going to require) and possibly shift some complexity around for authors (if they want to use pip wheel instead of a hypothetical twine wheel they might have to disable consumer oriented features of pip wheel).
Of course there are ways of dealing with all of those things too. You could move all of the "producer" commands under a separate command namespace, like pip twine upload or pip twine wheel. Another option is to just mash the two namespaces together and require flags to disable certain features for either producers or consumers. There are possibly other solutions to this fundamental idea that producers and consumers might want different things from their tools, and how we rectify that. Trying to move forward is likely going to require someone coming up with a full proposal and getting everyone agreed on it, and then making a PR that does that agreed upon behavior.
In this case, the folks maintaining pip are more or less wholly distinct from the folks maintaining twine.
But I thought all of PyPA was just maintained by @dstufft ???
Kidding aside, thanks for pointing this out. As is perhaps obvious, I thought that they were the same people.
[the argument that] extra complexity for authors (having to install another tool and use it) is worth it to remove some complexity from end users (removing commands they aren't likely going to require)
I would find this much more compelling if pip didn't already carry so much cruft. I'm sure pip wheel is essential to someone's workflow, but I also don't have much doubt that removing it to a standalone tool would be worth it.
There are two additional things that occur to me:
1. Inclusion in pip signals official status. This matters. Adding things to pip, even only in new versions which aren't shipped with python distributions, makes them much more "obviously correct" for novices and junior developers.
2. Even if the twine command remains a distinct command to invoke, we can talk about including it in the pip package.
One entirely practical issue with adding new commands to pip is that we're very stretched for resources on the project. We've got a lot of outstanding work to do rationalising and improving the existing codebase, and it's hard even getting that done. Something like merging in twine would be a lot of extra work, for relatively little benefit.
Having said that, I don't object to the idea of pip becoming more of a "one stop" solution for the overall package production and consumption cycle. It's mostly just a matter of prioritising how we spend our limited resources.
If we got more contributors/maintainers for pip, this is definitely something that could be done. In practice, with the current bandwidth of maintainers (who all work on pip mostly in their free time), there's little chance of this in the foreseeable future. We need to remember that Python as a whole very rarely gets resources for someone to work full time on projects, and as such we are constrained in what we can/should take on. Again, it's mostly about optimizing the benefit, risk and reward triangle.
I would question the "for very little benefit" part, though. Right now, anyone coming to Python is taught to use Python and pip, and then they hit a brick wall when they actually want to publish some code (no matter how small and niche-useful it is =). So at the very least, if the time and effort required to get pip publish to do what twine does right now is too much, then can we add "just parsing" for the command pip publish and have it print some instructions on what people should be doing instead?
E.g.:
$ pip publish
pip is python's dedicated install manager, and does not contain any code to help
facilitate the publication of packages. That task is handled by twine, which can be
installed using pip through "pip install twine".
$
And then bonus points for not leaving the user hanging:
$ pip publish
pip is python's dedicated install manager, and does not contain any code to help
facilitate the publication of packages. That task is handled by twine, which can be
installed using pip through "pip install twine".
Would you like to install twine? [Y/n] _
(Even more bonus points for "[...] does not contain any code to help facilitate the publication of packages at this time" of course =)
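For what it's worth, the "just parsing" stub itself would be tiny. A minimal sketch of such a placeholder, written here as a standalone script rather than wired into pip's real CLI, with the message text and the install prompt purely illustrative:

import subprocess
import sys

MESSAGE = """\
pip is python's dedicated install manager, and does not contain any code to help
facilitate the publication of packages. That task is handled by twine, which can be
installed using pip through "pip install twine"."""

def publish_stub():
    # Print the guidance, then offer to install twine on the user's behalf.
    print(MESSAGE)
    answer = input("Would you like to install twine? [Y/n] ").strip().lower()
    if answer in ("", "y", "yes"):
        subprocess.run([sys.executable, "-m", "pip", "install", "twine"], check=True)

if __name__ == "__main__":
    publish_stub()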
And sure, that means pip now advocates twine as tool of choice but something has to fill that role. No one is going to learn about twine, or wheel, or any of the tools that have come (and gone!) over the decades, unless they're either mentioned in the same breath as Python itself, or get a mention by those "not first party but for all intents and purposes, totally first party" tools in logical contexts.
Hello, I'm one of the two maintainers of Twine (along with @sigmavirus24). I'd personally be glad to see it rolled into pip publish and would be happy to write up a plan on how we'd do this. I'd be happy to spend more time on pip, although I admittedly have been having a hard time committing time to Twine.
I'm also totally cool with pip publish spitting out a helpful message while we figure this out.
I'm also willing to keep maintaining "twine" within "pip" and eventually working more on pip. I know @dstufft has been hoping to trick me into becoming a pip maintainer for a while, so I'm sure he'll be happy to hear that 😜
Wouldn't moving twine inside of pip require that pip take on all of twine's dependencies?
That means twine would need to either vendor all their dependencies, or pip and thus CPython would need to bring in a whole bunch of new dependencies. That seems less than desirable.
I think that is just the tip of the iceberg in terms of complexity added when you need your thing that installs packages to also upload packages, and I don't really see why they need to be combined. I don't think pacman should be able to upload packages to the arch repositories or the AUR. I don't think apt should be able to upload packages to Debian or Ubuntu's package repositories or PPAs.
If a pip publish command were implemented (and again I don't see why we need to do this, particularly because we are still in the process of telling people not to use setup.py upload...), I think it should be a thin wrapper around twine, which remains a separate package. pip publish would create an isolated build environment, install the latest version of twine in it, then pass the command through to twine. That would avoid most of the problems with combining pip and twine while also allowing people to use pip publish as an alias for twine.
I'll let the pip/twine maintainers worry about the technical bits, but regarding this part:
I don't think pacman should be able to upload packages to the arch repositories or the AUR. I don't think apt should be able to upload packages to Debian or Ubuntu's package repositories or PPAs.
Those distributions use a model where packaging is an arcane art practiced by an elite group of wizards, not something ordinary people do. PyPI is not that. PyPI is a community index intended to be welcoming to everyone.
And, at a more mercenary level, it's to our benefit to make it easy for users to become maintainers, because hooboy could we use more maintainers.
I would be -1 on a wrapper around twine where you have to install twine first. That just seems like a ton of extra complexity to allow people to invoke twine as pip publish instead of twine. If we're going to add pip publish then it should be part of pip.
And yes, we'd have to pull in twine's dependencies, which currently are:
pkginfo >= 1.4.2
readme_renderer >= 21.0 (this one is used by twine check, which I think would be reasonable to have as a dedicated utility ~forever since it's more of a linter than a packaging tool)
requests >= 2.5.0, != 2.15, != 2.16
requests-toolbelt >= 0.8.0
setuptools >= 0.7.0
tqdm >= 4.14
and optionally:
pyblake2 (on Python < 3.6)
keyring
Really I just go back to what we think provides the better UX. Like we're probably never going to fold twine check into pip, its dependencies are too heavy and it's getting too out of scope (IMO). So if people want to be able to lint their packages they're going to have to install another tool. Does it make sense to keep "producer" activities like uploading, building, linting, etc all focused into a singular tool, or does it make sense to roll the "mandatory" producer tooling into pip and spin out the more optional producer tooling into stand alone tools?
One question would be if there are other solutions that can fix the desire to roll tools into pip. Why is that better? Is it just because pip is installed by default? What if we made it easy to install those other tools by default? or even just installed them by default?
I don't honestly know the right answer to these questions! A lot of it is just someone who feels strongly about one path or another to figure out these answers.
Those distributions use a model where packaging is an arcane art practiced by an elite group of wizards, not something ordinary people do. PyPI is not that. PyPI is a community index intended to be welcoming to everyone.
My point is that not everyone assumes that a package installer is an "everything box" that installs packages, builds them, uploads them, creates virtual environments, runs tests, runs your formatter, etc. As far as I can tell, that will only be the assumption for people coming from specific communities.
"pip is the tool that interacts with PyPI" is a reasonable mental model if we had built it that way. It's reasonably well-scoped, but that's not really how pip
was developed and all our existing materials out there have one workflow for package installation and one workflow for publishing packages.
And the fact of the matter is, we really have to worry about churn. We're doing a very delicate and ambitious thing here, which is making radical changes to how packages are built and distributed fairly quickly. People are already very confused by when to use setup.py and when to use pip and when to use twine. Changing from "you should use twine" to "you should use pip publish" at this point will confuse huge numbers of existing people (plus all the people learning from tutorials based on the old best practice), for the benefit of new people who are used to a different code community that happens to use a different kind of build and packaging workflow. Those same communities also tend to have linting, testing and other related operations built into their "everything box", so now those people are still confused as to why it's "pytest" or "tox" instead of "pip test".
I think the best thing to do is to keep these tools well-scoped, and if one or more people want to build meta-tools that orchestrate them into one toolkit, we can link to the high-quality ones from the packaging tutorials.
requests >= 2.5.0, != 2.15, != 2.16
We already require requests, so this is basically a no-op.
Presumably twine needs to be refactored to use the vendored version of requests, no?
There's basically no option here where we just plop twine inside pip and call it good. Rolling twine into pip would likely be pulling twine in as part of pip, not a dependency, and modifying it to fit with pip itself.
That means twine would need to either vendor all their dependencies or pip and thus CPython would need to bring in a whole bunch of new dependencies. That seems less than desirable.
Yep. IMO, this is the biggest question that should be asked when suggesting functionality gets added to pip. There's actually a lot to be said for an approach where we ship a "mini-pip" that has the bare minimum functionality necessary to bootstrap the packaging infrastructure (essentially, nothing more than the ability to install wheels, plus a plugin architecture that lets additional installs hook in new commands or extend existing ones).
Having said that, as @dstufft pointed out, adding twine's dependencies (specifically) isn't that big of a deal. (Probably because pip already has quite a lot of dependencies itself... ;-))
I would be -1 on a wrapper around twine where you have to install twine first. That just seems like a ton of extra complexity to allow people to invoke twine as pip publish instead of twine. If we're going to add pip publish then it should be part of pip.
To be clear, my suggestion was that pip would do the twine installation for you, behind the scenes. When you invoke pip publish it would check if twine is the latest version (with some cache), then install the latest version into an isolated build environment that is saved for the next invocation. Essentially it's ensuretwine, using the existing build isolation mechanism to avoid putting twine into your user's normal Python path.
You can also skip this whole thing if twine is already installed.
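A rough sketch of that delegation, using only the standard library plus pip's command line. The cache location, the upgrade-on-every-invocation policy and the helper names are my own assumptions for illustration; this is not how pip's build-isolation machinery actually works:

import os
import subprocess
import sys
import venv
from pathlib import Path

# Hypothetical private environment that is created once and reused between invocations.
CACHE_ENV = Path.home() / ".cache" / "pip-publish-env"

def ensure_twine_env():
    # Create the environment on first use, then keep twine up to date inside it.
    if not CACHE_ENV.exists():
        venv.create(CACHE_ENV, with_pip=True)
    python = CACHE_ENV / ("Scripts" if os.name == "nt" else "bin") / "python"
    # "Check if twine is the latest version (with some cache)" is simplified here
    # to an unconditional upgrade attempt.
    subprocess.run([str(python), "-m", "pip", "install", "--upgrade", "twine"], check=True)
    return python

def publish(args):
    # Pass the command through to twine running in the isolated environment.
    python = ensure_twine_env()
    subprocess.run([str(python), "-m", "twine", "upload", *args], check=True)

if __name__ == "__main__":
    publish(sys.argv[1:] or ["dist/*"])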
That said, I am just clarifying what I meant, I still stand by my "publishing doesn't really belong in pip" stance.
Here are my very brief thoughts:
pip publish would be great. There's one other option here that we're ignoring: Twine is working on a real darn-tooting API. If we wanted pip publish to work, Pip could be a consumer of that real life API. I think there are even potentially good divisions inside the code-base that Pip could leverage itself. That said, the API is still in progress. Feedback from the pipfolk would be very welcome.
Twine is working on a real darn-tooting API
That's another point of conflict with pip. Pip doesn't expose any sort of public API. It would be odd (that's the kindest term for it ;-)) if pip exposed a publishing API but nothing else, so merging twine into pip would lose that. Using twine as a library, with pip publish as a consumer of its API, is a different question, and not one I feel particularly qualified to comment on without more research.
I do think not having to think about pip, twine, flit, poetry, pipenv, etc. would make peoples lives easier
Maybe. But the muddle of commands is at least in part due to a muddle of concepts - application, library, publishing, deployment, dependencies, requirements, pinning, ... I think that we need to start by getting our conceptual framework cleaner - otherwise, we're treating the symptoms rather than the cause.
Also, I have technical concerns about pip absorbing all of these roles. There are a lot of aspects of pip that reflect its somewhat unique constraints and background (vendoring requirements, lack of an API or a plugin architecture, ...). I'm not sure those features fit well with a modern, flexible package management command (however, I will admit that I have essentially zero experience in how other languages address these issues).
I think that we need to start by getting our conceptual framework cleaner - otherwise, we're treating the symptoms rather than the cause.
💯 True.
After thinking more, I want to "take backsies" on a lot of what I've said in this thread.
If installing python gave everyone pip and twine, the conversation would probably be much more centered around how to improve twine and its various "package publishing features" than whether or not to take the cosmetic, relatively insignificant step of coalescing the commands.
The concept of combining them into pip twine or pip publish is partly a proxy for the more essential "available everywhere (modern) by default".
The python stdlib provides unittest now, but pytest is still great. pip is a good installer, but pip-tools, pipenv, &co. will continue to exist and thrive.
I want a "unittest of package publishing" -- something unobjectionable, simple, good enough, and available in all modern environments by default.
And maybe that mandates that there's some twine init wizard that helps you get started.
It can be invoked as twine or pip publish or pyproject-toml-manager.pl or whatever. I think the whole matter of naming and whether or not it should be part of pip is a (super tempting) bikeshed masquerading as the real issue.
For my own part, I don't even care how much I disagree with its "sane defaults". Even if I disagree with most of the decisions made by such a tool, I will still be extremely happy that it exists at all.
TL;DR version of the below: +1 from me for a pip enable-publishing command that simplifies the account management and local environment management related steps in https://packaging.python.org/tutorials/packaging-projects/#uploading-the-distribution-archives, but I'm still -1 on offering pip publish itself.
Personally, I think it's incredibly bad practice to have software publication tools installed by default alongside a language runtime, and consider it a design mistake that the Python standard library currently includes support for the legacy distutils setup.py upload command (unfortunately, it's a major compatibility issue to get rid of it, and in the environments where folks care, they tend to just remove the entirety of the distutils module).
In addition to the Linux comparisons @pganssle already made above, on other systems, it would be akin to making Visual Studio a mandatory part of Windows installations, or XCode mandatory on OS X. The vast majority of Python users aren't going to be Python open source project publishers and that's OK.
Even for folks who are Python publishers, the majority of their Python installations still aren't going to be development systems or build servers, they're going to be runtime environments (which ideally wouldn't even have pip inside them, but there are currently logistical challenges to achieving that).
However, I think @pganssle is right that there are potential opportunities to take inspiration from the ensurepip module in the standard library: we use that to standardise the process of bootstrapping pip, without having to incorporate pip itself directly into the standard library.
If we were to go down that path, then the appropriate command to add at the pip layer would be something like pip enable-publishing that went through the following steps:
1. install twine and keyring into the current environment for you (as user installs if there's no venv active)
2. remind you to run twine upload dist/*
(The reminder at the end could potentially do a bit of filesystem introspection, and decide what to emit based on whether it finds a dist directory, pyproject.toml, setup.py, or none of the above)
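That introspection could be as simple as something along these lines (a sketch only; both the decision rules and the messages are illustrative):

from pathlib import Path

def publishing_reminder(project_dir="."):
    # Decide what hint to emit based on what the project directory contains.
    project = Path(project_dir)
    if (project / "dist").is_dir():
        return "Built distributions found; upload them with: twine upload dist/*"
    if (project / "pyproject.toml").is_file() or (project / "setup.py").is_file():
        return "Build your distributions first, then upload them with: twine upload dist/*"
    return "No project metadata found; see packaging.python.org for how to set up a project."

if __name__ == "__main__":
    print(publishing_reminder())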
The general idea would be that the account setup related parts of https://packaging.python.org/tutorials/packaging-projects/#uploading-the-distribution-archives would largely be replaced by "Run pip enable-publishing".
(Writing this up also made me realise one of the big reasons why npm is able to do this differently: there's a much stronger distinction in that ecosystem between the development runtimes used to emit minified JavaScript, and the browser and in-app runtimes that execute them, which means it's far less likely that npm will end up installed into a runtime environment by default)
Thanks for stating that last point about npm (the parens make me think you considered omitting it) -- I think it's very relevant. Whether or not we agree on what should be done in python, it clarifies "this is why npm is so different".
pip enable-publishing is fine as proposed. It would be better than where we are now. But I want more. (I'm a needy package publisher! 😛 )
In addition to the Linux comparisons @pganssle already made above, on other systems, it would be akin to making Visual Studio a mandatory part of Windows installations, or XCode mandatory on OS X. The vast majority of Python users aren't going to be Python open source project publishers and that's OK.
I don't see why this would be a bad thing. What are we asking non-publishers and server runtimes to give up in order to have this, other than disk space?
It's a huge quality of life improvement for publishers, at little expense to anyone else.
If I had apt make-deb and yum make-rpm I'd be happy, and what would Ubuntu desktop users lose?
Personally, I think it's incredibly bad practice to have software publication tools installed by default alongside a language runtime
Why is that? Comparison with Rust + Cargo is relevant here. Has Rust made a mistake including Cargo by default, or is there some circumstance that makes python different?
The desire for a minimal standard lib will sometimes run hard into the desire for a rich, batteries-included standard lib. If the standard lib is going to include mock, 2to3, and xmlrpc, why is twine so different?
Perhaps entering dangerous waters, but if venv and zipapp exist, shouldn't twine or ensuretwine be there too?
I think I am -1 on any sort of short cut for pip install twine keyring or whatever. That feels like the kind of "magic" that will add further confusion to people when they don't fully understand what that short cut is doing.
I view pip as a more developer centric tool than an MSI installer or apt-get or yum. Long term, random end users should probably not be pip installing things, but should be installing "distribution aware" packages. IOW our toolchains should be improved so we can produce MSIs, .deb, etc as a matter of course. Likely ones that include Python themselves as part of it, and which internally utilize pip or the like as part of the build toolchain.
Of course there are always going to be cases where the above isn't the right answer, e.g. installing a requirements.txt into a Heroku environment.
That ultimately ends up being exactly the same case as npm is in. If you're installing just plain old Python, you're going to get pip installed, just like if you're installing plain old Node.js you're going to get npm installed. If you're producing something like a bundled app, then your bundled app is unlikely to include pip.
Ultimately, I think trying to come up with rationalizations for why it "belongs" in pip or "belongs" preinstalled, or why it doesn't, is the wrong way to think of it. Plenty of people argued bundling pip with Python was the "wrong" thing to do for a variety of reasons, and they were, IMO, wrong at the time. What I think matters is what we think provides the best experience, instead of trying to be hardline about some idealistic, vague "rules" about where stuff "belongs".
That being said, I don't know if merging the tools is the "right" thing. It would represent trade offs at the sub command level instead of trade offs at the top level command level. One tiny example, currently I have a single version of twine installed but many versions of pip installed, bundling them would make it harder for me to ensure I'm using the latest version of the publishing tools, since I have to update them in every virtual environment instead of just once at the top level. Obviously merging the two tools provides some benefits as well, since it makes it easier for users to know what tool they need to use, but I don't think it's a slam dunk and there may be ways (like also pre-installing twine) that we can use to negate the downsides of two commands, while still getting the upside. Or maybe just merging the two commands is really what's easiest for users. I dunno! I've been steeped in packaging lore for so long it's hard for me to step back.
Comparison with Rust + Cargo is relevant here. Has Rust made a mistake including Cargo by default, or is there some circumstance that makes python different?
Rust has no language runtime, it produces native binaries, so the distinction is even stronger there between tools for users and tools for producers. Though I am less sure than Nick about the degree to which that makes the difference, it's undeniably true that users of cargo are definitely producing software projects. Users of pip may be doing many other things, installing software, creating an interactive console, etc.
Comparison with Rust + Cargo is relevant here. Has Rust made a mistake including Cargo by default, or is there some circumstance that makes python different?
Experience with distutils says that putting publishing tools into the standard library (where they can't change rapidly in response to changes in the publishing ecosystem) results in problems. The situation isn't necessarily the same here, and it's possible that people are over cautious because of the history, but conversely, the benefits are small.
Putting pip into the standard library was a huge step, because it essentially solved the bootstrapping issue of how to get other tools. But with pip, getting a publishing toolchain is nothing more than pip install -r standard-publishing-toolchain.txt. Sure, having tools built in is better (I'm normally an opponent of the "it's easy to just install stuff" argument) but is the benefit sufficient to justify the risks?
Regarding comparison with Rust, I see that @pganssle has already made some comments. Another factor to consider is that Python package producers routinely interact with non-Python tools (C compilers and libraries, tools like Cython, ...) As far as I know, Rust doesn't have to do that - and so Cargo doesn't have to react as those tools change. So yes, Python is different from Rust/Cargo.
I have to say I don't understand this part:
Personally, I think it's incredibly bad practice to have software publication tools installed by default alongside a language runtime, and consider it a design mistake that the Python standard library currently includes support for the legacy distutils setup.py upload command (unfortunately, it's a major compatibility issue to get rid of it, and in the environments where folks care, they tend to just remove the entirety of the distutils module).
While I share the opinion that legacy support for something like distutils should never have happened, I don't see the connection between the fact that that support is there, and the notion that it's somehow bad practice to have the publication tools bundled with the language suite. Those are two completely separate things, and I'd like to understand what makes you think it's bad practice to offer those tools as part of the standard library.
I'd also caution against drawing parallels between pip and apt/yum. In part because they're only similar on the surface, in the sense that they might all fit the "installation managers" label while differing substantially in context, but also in large part because Python is a cross-platform language: discussions about its package manager that require drawing parallels should draw parallels to other cross-platform programming language package managers, not to OS-specific installation managers (which gets even worse in the case of apt or yum, which aren't even Linux-specific, but "only certain flavours of Linux"-specific).
So that means comparing pip, as a programming language dependency manager, to other such tools like cargo or npm. These tools of course have the benefit of being very new tools indeed, so there are lessons to be learned from the decisions they made after looking at what people want out of these tools, and what they actually get out of these tools, looking at all the languages that came before them, including how Python has handled package management. As it turns out, truly making these tools the package manager, not just the package installer (with up/downgrades just a more specific form of install), and having them be part of the default installation greatly benefits everyone.
So I'd like to understand comments around why it would be a bad thing to (in limited fashion from what I'm reading so far) effect this kind of empowerment for users of Python. The added disk space adds up to "basically irrelevant" except for embedded systems (where no sane person would use a standard install anyway), and it sounds like the maintainers of twine are up for folding its functionality into pip, so this all sounds like great stuff, and I still really hope to see a fully functional pip publish come with a near-future version of Python, ideally with an interim solution in the very next version where pip publish either tells people what to do, or asks them whether it should bootstrap everything for the user to, with minimal additional work, get that code pushed up and available to the rest of the world for use and collaboration.
(Note: thinking out loud in this comment so folks can see where I'm coming from in relation to this. I'll post a second comment describing a version of pip publish that would address all my technical design concerns without the UX awkwardness of pip enable-publishing)
The root of the design problem we face at the pip level actually lives at the Python interpreter level: unlike other ecosystems, we don't make a clear distinction between "development environments" and "runtime environments".
C/C++: …
Rust: cargo …
Java: …
JavaScript: … node_modules embedded in it
Python: …
So, in writing that out, I think my main concern is actually a factoring problem, in that I'd like users to be able to easily decide for themselves whether they want to configure a system as:
1. … no pip, no wheel, no twine, no setuptools, no distutils (the first 3 of those are readily achievable today, the latter two are still a work in progress)
2. …
3. … (which is what pipenv needs, for example, along with any other pipeline that converts Python packages to a different packaging ecosystem, whether that's a Linux distro, conda, etc)
Point 1 is handled at the standard library level with ensurepip (once we figure out the thorny mess of having the distutils API be provided by setuptools instead of the standard library)
That means it's only points 2 & 3 that impact the design of a pip publish command. Saying "we don't care about the concerns of folks that want to minimise the attack surface of their build pipelines" is certainly an option, but I don't think it's a choice that needs to be made (hence the design in the next comment).
I realised there's a way of tackling pip publish that would actually address all my design concerns:
- pip would declare a publish extra, such that running pip install --upgrade pip[publish] instead of pip install --upgrade pip installed any extra dependencies needed to make pip publish work. (Declaring an extra this way covers points 2 & 3 in my previous comment)
- pip publish would be implemented using the in-progress Twine API @sigmavirus24 mentioned in https://github.com/pypa/packaging-problems/issues/60#issuecomment-447107759 (and presumably influence the design of that API)
- pip publish would prompt to auto-install the pip[publish] extra if it found any of its import dependencies missing
- … pip release cycle
From an end-user perspective, that would all end up looking like an "on first use" configuration experience for the pip publish command (which is always going to exist due to the need to set up PyPI credentials on the machine).
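The "prompt on first use" part could look roughly like the following sketch. It assumes the extra is called publish and that twine and keyring are its import-level dependencies; the pip[publish] extra does not exist today, so this is only an illustration of the proposed flow:

import importlib.util
import subprocess
import sys

# Hypothetical import-level dependencies that the pip[publish] extra would pull in.
PUBLISH_REQUIREMENTS = ("twine", "keyring")

def ensure_publish_extra():
    # Offer to install the pip[publish] extra if any of its dependencies are missing.
    missing = [name for name in PUBLISH_REQUIREMENTS
               if importlib.util.find_spec(name) is None]
    if not missing:
        return True
    print("pip publish needs the following missing packages: " + ", ".join(missing))
    answer = input("Install them now via 'pip install pip[publish]'? [Y/n] ").strip().lower()
    if answer in ("", "y", "yes"):
        subprocess.run([sys.executable, "-m", "pip", "install", "pip[publish]"], check=True)
        return True
    return False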
From a maintenance perspective, while the existence of twine as a support library would become a hidden implementation detail, the twine maintainers would likely still need to become pip maintainers as well, so they can handle pip publish issue reports, and bump the minimum twine version requirement as needed (similar to the way @dstufft originally became a CPython core dev primarily to handle updating the bundled pip to new versions).
As an added bonus, all the documentation about extras would gain a concrete example that it can point to: pip[publish] :)
Is attack surface your main concern? Because python already ships with multiple ways to make arbitrary HTTP requests, and doesn't ship with any PyPI credentials. So I'm having trouble seeing how having twine available would increase attack surface in a meaningful way? What's your threat model?
@njsmith Every other part of pip can interact with PyPI anonymously, but upload needs the ability to work with the user's PyPI credentials.
Not putting the credentials on the machine in the first place is obviously the primary defence against compromise, but if the code is useless without credentials, why have it there, instead of designing the tooling to add the dependencies at the same time as you add the credentials?
Keeping the dependencies separate also means that if a CVE is raised against the way twine accesses the system keyring, or the way it interacts with a user's account on PyPI, then it's only a vulnerability on systems that have twine installed, not on all systems that have pip installed. (A future version of pip would presumably raise the minimum required version of twine to a version without the vulnerability, but that would be a matter of dependency management hygiene, rather than urgent CVE response)
That said, laying out the considerations as I did above means I now think most of the cases where this kind of concern really matters will be ones where the feature to be removed from the deployment environment is the entire build and installation toolchain, and that's already possible by doing pip uninstall pip wheel setuptools once the target venv has been set up (getting rid of distutils is more difficult, but still possible).
So while I think the "extra"-based approach would be architecturally clearer (i.e. pip primarily remains an installation tool, but has some core publication functionality that relies on some optional dependencies), I don't think having it baked into the default install would create any unsolvable problems - worst case is that it would just give some folks an increased incentive to figure out how to remove pip from their deployment artifacts entirely, and there might end up being some future CVEs that impact more Python installs than they otherwise would have.
I like @ncoghlan's idea -- have a pip command that's providing (optional) upload functionality, implemented using twine's public API, with an extra in pip to install the dependencies for it. :)
It's been over 5 years since this issue got filed, and almost 2 years since the discussion died down and nothing happened. However, the entire world would still benefit from being able to type pip publish, because publishing a package is still ridiculously hard in this ecosystem.
Just pick an approach, implement it, and then iterate on refining or even wholesale changing that implementation as the command sees adoption. As long as pip publish works at all, improving how it works can be a rolling target.
In 2020 neither flit publish nor twine upload are "ridiculously hard" by any standards, and if they are perceived as such it's a documentation issue, not a tooling issue.
@astrojuanlu
- If nobody has complained about this in 2 years maybe it's not that crucial.
Why should one constantly add complaints if there's an issue open already? I guess only few would agree that Python packaging tooling is a pleasant thing to use. Besides, there are complaints now, and you're complaining about those. So maybe you should make up your mind on this matter.
- In 2020 neither flit publish nor twine upload are "ridiculously hard" by any standards, and if they are perceived as such it's a documentation issue, not a tooling issue.
Oh come on, the tooling is really not great compared to what we're seeing e.g. with NPM. Nobody's saying that the pip / PyPA team hasn't been doing an amazing job, but in comparison to other ecosystems, Python is just so far behind.
Oh come on, the tooling is really not great compared to what we're seeing e.g. with NPM. Nobody's saying that the pip / PyPA team hasn't been doing an amazing job, but in comparison to other ecosystems, Python is just so far behind.
How many people work on and support npm? Wikipedia says "The company behind the npm software is npm, Inc, based in Oakland, California. [...] GitHub announced in March 2020 it is acquiring npm, Inc". The pip development team consists in total of about 5 people, all of whom only work on pip in their spare time. Frankly, I'd hope npm would be better than pip, with that level of disparity in development resource...
Most of the work in the Python packaging space appears to be - with the sole exception of the new dependency resolver - unfunded and is carried out by volunteers in their free time. npm was VC-funded as early as 2013 and is now maintained by GitHub.
Edit: heh, we posted almost the exact same thing at the exact same time.
Yes, I don't challenge that. This is a totally acceptable explanation for why Python packaging is in such a bad shape. But still one should acknowledge that Python packaging is not great by any standards. Why that is the case is a different question. I'm thankful for the work people have put into the existing ecosystem either way, but this doesn't mean one cannot dislike or criticize it.
That's a very absolute statement. There are certainly some standards by which Python packaging is fine:
Progress is slow. But it's not non-existent. And there are reasons why it's slow. People complaining that the volunteer labour "doesn't get things done faster" is one of the reasons it's slow, because it discourages and burns out the people whose freely given efforts are being dismissed as insufficient. I speak from experience here, as I know I'd do far more on pip if I didn't have to review so many issues that left me feeling demotivated.
this doesn't mean one cannot dislike or criticize it
However, finding ways to express such a dissatisfaction without implying some level of failure on the part of the people who voluntarily give their time to the work, is very hard. And people typically don't make any effort to do that, but simply throw out criticisms, and then follow up with "yes, but I appreciate the work people have done, I just dislike the result".
And furthermore, how is complaining and criticising without offering any help, productive? If you were to submit a PR implementing a pip publish command that took the discussion so far into account, your views would be much more relevant and welcome. But just out of the blue commenting that "this sucks" isn't really much help in moving the issue forward.
Never mind. I don't want to spend my Sunday worrying about explaining this to people. I'll go and find something more enjoyable to do. (And if that means I don't work on pip today, that's a good example of the consequences of this sort of discussion).
And if that means I don't work on pip today, that's a good example of the consequences of this sort of discussion
+1
The fact that this was the first notification/issue thread I've read on this Sunday, is directly the cause of why I'm not spending any more time today to work on pip.
@pfmoore
Progress is slow. But it's not non-existent.
Nobody said that.
People complaining that the volunteer labour "doesn't get things done faster" is one of the reasons it's slow, because it discourages and burns out the people whose freely given efforts are being dismissed as insufficient. I speak from experience here, as I know I'd do far more on pip if I didn't have to review so many issues that left me feeling demotivated.
this doesn't mean one cannot dislike or criticize it
However, finding ways to express such a dissatisfaction without implying some level of failure on the part of the people who voluntarily give their time to the work, is very hard. And people typically don't make any effort to do that, but simply throw out criticisms, and then follow up with "yes, but I appreciate the work people have done, I just dislike the result".
I can very much empathize, I've been in your shoes before, many times. Maybe to clarify once more: I greatly appreciate the work and effort that people have put into PyPA and pip. But I think it's not okay to simply deny there are still many issues to be resolved when there clearly are issues. Because my impression was that this is exactly what was happening in response to @ArjunDandagi's and @Pomax's comments (and is the only reason why I joined the discussion)
And furthermore, how is complaining and criticising without offering any help, productive? If you were to submit a PR implementing a pip publish command that took the discussion so far into account, your views would be much more relevant and welcome. But just out of the blue commenting that "this sucks" isn't really much help in moving the issue forward.
First off, I never said "it sucks". Secondly, I believe it's a mistake to only allow criticism if the one criticising has a solution for their problem right at hand. One must be able to express dissatisfaction even if one doesn't know how to resolve the problem.
Never mind. I don't want to spend my Sunday worrying about explaining this to people. I'll go and find something more enjoyable to do. (And if that means I don't work on pip today, that's a good example of the consequences of this sort of discussion).
@pradyunsg
+1
The fact that this was the first notification/issue thread I've read on this Sunday, is directly the cause of why I'm not spending any more time today to work on pip.
You can spend your time however you want to. Nobody's forcing you to do anything.
If you were to submit a PR implementing a pip publish command that took the discussion so far into account, your views would be much more relevant and welcome.
Just to add, as a maintainer of various open source projects (not pip), a PR like this is probably not as helpful as it initially sounds. If you're not familiar with the internals of a project, your first attempt at writing a significant new feature is likely to need a lot of work, and therefore take up a lot of reviewers' time. It can also cost a lot of mental & emotional energy to explain to a well-intentioned contributor that the changes they've spent hours or days on are not going to be merged, and at least for me, this really drains my enthusiasm to work on a project.
So, before contributing pip publish (or any other significant change to an open source project), it's a good idea to work out:
@hoechenberger
I believe it's a mistake to only allow criticism if the one criticising has a solution for their problem right at hand
You are allowed to criticise. @pfmoore suggested that it was not productive for you to do so. It looks like you've contributed to dissuading two maintainers from spending time on pip today, so I'd have to agree with him.
The issue is that criticising Python packaging has been done to death for years. Anyone involved in Python packaging knows there are still plenty of warts and areas for improvement. So another round of "why isn't this fixed yet?" without engaging with the details of the discussion is not actually driving anything forwards.
I will endeavour to resist the urge to reply again for at least the rest of the day.
Progress is slow. But it's not non-existent. And there are reasons why it's slow. People complaining that the volunteer labour "doesn't get things done faster" is one of the reasons it's slow, because it discourages and burns out the people whose freely given efforts are being dismissed as insufficient. I speak from experience here, as I know I'd do far more on pip if I didn't have to review so many issues that left me feeling demotivated.
As a maintainer of a project used by many (pytest), I definitely concur with this statement.
It would really help if people made concrete notes about what is not good in Python tools, and what is great in other tools.
That would be the comment that started this thread. In npm land, which is probably the best publishing experience, there is one tool, and just one tool, and the mirror-sequence of steps is:
1. You log in once with npm login, until you uninstall NPM or reformat your machine or the like.
2. You run npm init to set up the publishing metadata, with a guided CLI experience that asks you for all the information required.
3. There are npm version major, npm version minor, and npm version patch commands to make "sticking to semver" as easy as possible for maintainers, which makes it possible for people who use npm packages to trust that patch/minor changes don't break a codebase when uplifting (at least, to the degree where any package that breaks that trust is a genuine surprise).
4. You publish with npm publish. This command packs up your local files, with optional exclusions through either .gitignore or .npmignore, but that archive only exists in memory, for as long as it needs in order to be uploaded.
This is essentially frictionless, through a single tool. Yes, a competitor was written to address NPM's slowness, called "yarn": but they quite wisely decided to make it work in exactly the same way, so if you're coming to Python from the Node ecosystem at least (or if you're a long time user of Python and you started working with Node), you are effectively spoiled with an excellent publishing flow and tooling for that.
There were discussions around having pip "wrap" other tools, so that it could at least act as a front-end for the publishing workflow and people would only need the one command: that would still be amazing. Sure, it would end up preferring one tool over another, but that's fine: folks who don't want to have to care, don't have to care, and folks who do care don't need to change their publishing flow and can keep using their preferred tools for (part of) the release chain as before.
There were discussions around having pip "wrap" other tools, so that it could at least act as a front-end for the publishing workflow and people would only need the one command: that would still be amazing.
One suggestion - not intended as deflection, but as a genuine way for community members to help explore the design and maybe tease out some of the inevitable difficulties in integrating existing tools in such a front end. Maybe someone could build a tool that acted as nothing but that front end - providing the sort of user interface and workflow that node users find so attractive (I've never used node myself, so I have no feel for how npm "feels" in practice), while simply calling existing tools such as pip, twine etc, to deliver the actual functionality.
If we had such a frontend - even in the form of a working prototype - it would be a lot easier to iterate on the design, and then, once the structure of the UI has been sorted out to work in the context of Python, we could look at how (or maybe if) we would integrate the command structure into pip or whatever.
I think it is important to recognise that these complaints pertain to setuptools. Working with Flit and Poetry, which provide their own CLI, is not unlike working with npm. The addition of a pip publish command will not meaningfully improve the situation with setuptools - not least because pip does not have an accompanying build command (there is a wheel command but that only builds.... wheels) and neither does setuptools (the setuptools CLI is deprecated and slated for removal). There is work being done in this area but it is slow both for historical reasons and for lack of resources. python-build is a generic build tool which - as I understand it - will be adopted by pip once stable and blessed by setuptools. There has been discussion on improving the setuptools documentation and on adopting pyproject.toml. There are several PEPs under discussion which seek to standardise on how to specify package metadata in pyproject.toml. These are all things that open up new possibilities for pip and for other tools, like the hypothetical integrated frontend that @pfmoore has mentioned above.
I think it's probably also worth noting that a (small?) elephant in the room is that if you're coming to Python from another language, or even if it's your first exposure, you get told by nearly everyone that "how you install things" is through pip. So even if in reality it's just one of many ways to install/publish packages, and some of the alternatives are quite streamlined, that's not what people are being taught to think of pip as. It's essentially "python and pip" in the same vein as "node and npm" or "rust and cargo" etc. That's not something anyone can even pretend to have any control over at this point, of course, but it's a strong factor in what people new to Python, or even folks "familiar enough with python to use it on a daily basis alongside other languages", have been implicitly conditioned to expect from pip.
Having someone write a "unified" CLI proof of concept tool sounds like a great idea, and I'd be more than happy to provide input around the "npm experience" (having published a fair number of packages there), although I would not consider myself familiar enough with the various tools (or with enough time to deep-dive) to write that PoC myself.
We had basically these same arguments about adding a pip build command. I am personally way more in favor of @pfmoore's idea (and it's one I've suggested before) of having a top-level tool that wrangles all the various other tools for you. There's a bunch of complications that come with cramming a bunch of different tools (publisher, builder, installer) into a single common tool.
For example, the motivation behind the unix philosophy does-one-thing-well tools for building distributions and installing wheels is that many downstream distributors feel the need to bootstrap their whole builds from source, and it's a pain in the ass to bootstrap a swiss army knife monolith like pip compared to small, targeted tools.
I also think that it's easy to look at cargo and npm and think that they have everything figured out, but these big monolithic tools can be a real problem when, because of poor separation of concerns, nominally unrelated aspects of them become tightly coupled. I know a few places where we've had problems because cargo didn't (doesn't?) support any endpoint other than crates.io, and in general they are also in an early phase, when there's still a lot to build out and not yet a lot constraining them from making big changes.
I'm not saying that those ecosystems and all-in-one tools are worse than what we have or even that there's no benefits to them, but in the past we had an all-in-one tool for this: distutils. setup.py was extensible, had a bunch of built-in stuff for building, testing, installation, etc. Over time bitrot, tight coupling and poorly defined interfaces have made it mostly a minefield, and on top of that it's just fundamentally incompatible with how software development works these days.
I think a bunch of individual tools with one or more wrapper CLIs for various purposes makes a lot of those problems much more tractable in the long term, and might help the people clamoring for a "single endpoint".
Having someone write a "unified" CLI proof of concept tool sounds like a great idea, and I'd be more than happy to provide input around the "npm experience" (having published a fair number of packages there), although I would not consider myself familiar enough with the various tools (or with enough time to deep-dive) to write that PoC myself.
To be honest, no-one had that sort of familiarity with the tools/ecosystem when they started. Why not just write a gross hack and see how things develop from there?
mypip.py
import subprocess
import sys

if __name__ == "__main__":
    if sys.argv[1] == "publish":
        # Hand "publish" off to twine; everything else falls through to pip.
        subprocess.run(["twine", "upload"] + sys.argv[2:])
    else:
        subprocess.run(["pip"] + sys.argv[1:])
In all seriousness, that's the bare bones of a utility that adds a "publish" command to pip's CLI. Clearly, there's a lot of work to make even a prototype out of this, but if you started with that and actually used it, and improved it as you hit annoyances/rough edges, you'd pretty soon end up with something worth sharing. Most of the tools I've ever written started out like this.
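For instance, assuming twine is installed and you have already built something into dist/, a session with it might look like:

$ python mypip.py install requests
$ python mypip.py publish dist/*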
(I'm not trying to insist that you do this - just pointing out that "I don't know enough" is actually far less of a hurdle than people fear).
Even after you've written a setup.py, publishing a package to PyPI is hard. Certainly I found it confusing the first time.
The sequence of steps is complex and made stressful by all the decisions left to the user:
- registering the package (with setup.py register?)
- providing credentials (with setup.py register or by writing a .pypirc?)
- uploading the distributions (with setup.py upload or with twine?)
It would be neat to have a single command pip publish analogous to npm publish that did all of this, correctly. It would build whichever distributions are deemed fashionable (source + wheel). If you weren't logged in it would automatically run the wizard pip register.