Open zmt opened 7 years ago
@pradyunsg @dstufft I'm keen to implement this (mainly so we can support other auth schemes with Azure Artifacts when the Python support goes public).
Will you be at the sprints at Bloomberg in a couple of weeks?
I will be, in London.
Okay, good to know.
Our main hope is to support git credential helpers (see https://git-scm.com/docs/api-credentials and https://git-scm.com/docs/git-credential), in no small part because we already have a compatible credential helper that will work for us :)
I'm not sure how orthogonal that is to enabling the auth class to be overridden, but we're willing to contribute both. We just want some guidance on how many places you think this ought to touch, or any preferences as to how it's added.
Just posted the same proposal for twine, as it didn't appear to be there. Hopefully we can design something that works similarly for both.
Rather than going directly to the external process credential helper, having a (common) Python interface would be just as easy. Then perhaps we can publish our authorization tool in a package and installing that could enable auth for certain URLs automatically? That may satisfy all the needs here, without having to support any particular external helpers in pip/twine themselves.
@zooba do you mean publishing a dist which registers itself within a certain entrypoint pip would recognize?
@webknjaz That's what I'm thinking. An entrypoint or something similar would be fine, as that way pip install <cred-helper> could automatically light up without the user also having to add more command line arguments.
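To make the entry-point idea concrete, here is a minimal sketch of how pip or twine could discover installed credential helpers. The group name `pip.auth_helpers` is made up for illustration; neither tool defines such a group today. Only the standard library's `importlib.metadata` is used.

```python
from importlib.metadata import entry_points

# Hypothetical group name: pip does not define this entry point group.
AUTH_PLUGIN_GROUP = "pip.auth_helpers"


def discover_auth_helpers(group=AUTH_PLUGIN_GROUP):
    """Return {name: loaded_object} for every helper registered under `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        selected = eps.select(group=group)  # Python 3.10 and later
    else:
        selected = eps.get(group, [])  # dict-style API on Python 3.8/3.9
    return {ep.name: ep.load() for ep in selected}


if __name__ == "__main__":
    # With no helper package installed this prints an empty dict.
    print(discover_auth_helpers())
```

A helper package would register itself under that group in its own packaging metadata, so installing it would be enough for the host tool to find it.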
Okay, looks like this is basically adding keyring support, so I filed #5948 (basically, when we can't handle the 401 response from pip's cache, go through keyring first before prompting the user).
I believe that will also work for the original use cases? Keyring has extensible backends, so it would mean installing another package that includes keyring before installing the package. @zmt - thoughts?
I haven't thought about this really since 2017. The addition of keyring support is great, but doesn't appear to help with SSO or using ssh certificates from an ssh-agent, which were 2 authentication methods I initially had in mind back then.
Wouldn't the best solution be for pip to simply allow users to provide a custom requests.auth.AuthBase? There are already a few useful auth implementations for requests, like OpenId and Kerberos; see https://requests.readthedocs.io/en/master/user/authentication/
Is anyone already working on a solution?
@fedorbirjukov If anyone is actually seeking a solution, they are not doing it publicly :) Go ahead and work on it!
I don’t think it’s a good idea for pip to provide direct access to requests.auth.AuthBase; pip's use of Requests should be treated as an implementation detail. An intermediate abstraction would be needed.
I created PR #8029 based on PR #3731. Fingers crossed. UPDATE: closed it after having a closer look.
I also opened a PR #8030 that is related to this.
My change adds a --extra-headers option to pip commands that enhances the PipSession object with arbitrary headers so you can do things like token-based authentication.
E.g.:
pip install \
--extra-headers='{"Authorization": "..."}' \
--index-url https://secure.pypi.example.com/simple \
--trusted-host secure.pypi.example.com \
fizz==1.2.3
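As a rough illustration of what such an option has to do, the sketch below parses the JSON value and attaches the headers to a request. It is not the code from the PR: `parse_extra_headers` and `build_request` are hypothetical names, and `urllib.request` stands in for pip's PipSession so the example stays standard-library only.

```python
import json
import urllib.request


def parse_extra_headers(raw):
    """Parse a JSON object of header names to string values; reject anything else."""
    headers = json.loads(raw)
    if not isinstance(headers, dict):
        raise ValueError("--extra-headers must be a JSON object")
    for name, value in headers.items():
        if not isinstance(value, str):
            raise ValueError(f"header {name!r} must map to a string")
    return headers


def build_request(url, raw_headers):
    """Build (but do not send) a request carrying the extra headers."""
    return urllib.request.Request(url, headers=parse_extra_headers(raw_headers))


if __name__ == "__main__":
    req = build_request(
        "https://secure.pypi.example.com/simple/",
        '{"Authorization": "Bearer <token>"}',
    )
    print(req.get_header("Authorization"))
```

Validating the value up front means a malformed option fails before any network traffic happens.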
I’ve cleaned up the previous comments a bit to focus this thread on the remaining issue at hand: implementing a way to plug in custom authentication backends, to support using methods such as Kerberos (#6708) and Windows Integrated Authentication (#8163).
The solution will likely be some kind of plug-in system, so a user can install a backend alongside pip and use a flag to tell pip to use it. So the next steps, from what I can tell, would be to a) come up with a design, and b) identify places that need to be pluggable. I’m marking this as deferred till PR since some actual code would likely be the easiest way to kick off the discussion.
I honestly think pip should look to git-remote-helper as a model for a possible solution here. Example usage could simply be something like this:
$ pip install my-private-package --extra-index-url s3://my_private_pypi_bucket/
When the "scheme" of the repository URL (s3 in this case) is unknown to pip, it tries to start a subprocess named something like pip-remote-s3, whose executable would be located on the PATH due to the installation of some 3rd party helper. It then sends "commands" to the subprocess via stdin, much like git-remote-helper.
You could allow others to implement whatever custom auth mechanisms they like via one of these helpers, and users need to simply install said helper onto the PATH, then use the helper's corresponding scheme in the index URL. To be honest this isn't even custom authentication support per se, but more custom protocol support which would allow whatever authentication mechanism you'd like. pip install via SFTP? No problem!
I don't know exactly what the protocol between pip and the helper would look like, or what layer of abstraction it should lie on. Should the helper simply send PEP 503-style responses to stdout? Should we allow the helper to ask input from the user directly during pip commands? Should CLI options be passed from the pip command (something like --<scheme>-helper-options), or should we limit helper configuration to its own devices, config files and the like? Just some thoughts, would like to discuss.
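To give the discussion something to poke at, here is a minimal sketch of the pip side of such a helper. Everything about the protocol is an assumption: the `pip-remote-<scheme>` naming follows the example above, while the one-line `list <url> <project>` command and the line-per-file reply are invented for illustration.

```python
import shutil
import subprocess
from urllib.parse import urlsplit


def helper_name(index_url):
    """Map an index URL to a helper executable name, e.g. s3://... -> pip-remote-s3."""
    scheme = urlsplit(index_url).scheme
    if not scheme:
        raise ValueError(f"no scheme in {index_url!r}")
    return f"pip-remote-{scheme}"


def ask_helper(argv, command):
    """Send one command line on the helper's stdin and return its stdout lines."""
    result = subprocess.run(
        argv, input=command + "\n", capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


def list_files(index_url, project):
    """Ask the helper responsible for `index_url` to list files for `project`."""
    exe = shutil.which(helper_name(index_url))
    if exe is None:
        raise LookupError(f"no helper on PATH for {index_url!r}")
    # "list <url> <project>" is a made-up command, not an existing protocol.
    return ask_helper([exe], f"list {index_url} {project}")
```

Keeping the exchange to plain lines over stdin and stdout would let a helper be written in any language, which is much of the appeal of the git model.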
If we choose to go down this path I'd be happy to have a stab at a PR for it. I'm not familiar with pip's internals but I'd like to get involved.
@tharradine Good point. I've never used git-remote-helper, at least consciously. But its model seems to allow integrating completely different technologies.
I used git on Windows though. And Git has out-of-the-box Windows support, called schannel (Secure Channel). And that's what I'd like pip to have, too. But pip devs are reluctant to go down that road.
The twine project has a similar feature request: https://github.com/pypa/twine/issues/362
I wonder if this is a good candidate for a fundable packaging project. Both pip and twine use requests internally, so it might be a good idea to build an entrypoints-based plugin system that can be used by both. I expect corporations would be the main users as well, so it makes sense to ask them for resources.
As already mentioned above (maybe too vague), requests already supports custom authentication handlers so you don't need some complicated process communication protocol: https://requests.readthedocs.io/en/master/user/authentication/
So in theory the user just has to configure a factory that creates such an authentication object (for example, an auth.py file in the pip config folder returning a requests_ntlm.HttpNtlmAuth). Pip creates an instance and passes it to requests.
That would be a really simple solution, and it has the benefit that existing requests auth handlers can be used without modification.
We can theorise all day, but ultimately someone still needs to put in time and effort to write the code. Which is where funding comes into play.
I would expect that organizing funding for my proposal would take more time than implementing the solution...
We can theorise all day
That's kind of the point of these issues, is it not? Funding is not a prerequisite to discussing design ideas, and it is not even a prerequisite to an implementation. I've offered my time in a previous comment.
If someone is willing to help with the configuration part in pip I can make a PoC.
I would propose something like PIP_AUTH_FACTORY/--auth-factory which should point to a Python file. This Python file has an auth function (or other callable) returning a requests.auth.AuthBase.
For example:
from requests_ntlm import HttpNtlmAuth

def auth():
    # Called by pip to obtain the auth object it hands to requests.
    return HttpNtlmAuth('domain\\username', 'password')
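On the pip side, loading such a file could look roughly like the sketch below. This is only an assumption about how the proposed option might be wired up: `load_auth_factory` is a hypothetical name, and nothing like it exists in pip.

```python
import importlib.util


def load_auth_factory(path, attr="auth"):
    """Import the Python file at `path` and return its `attr` callable."""
    spec = importlib.util.spec_from_file_location("_pip_auth_factory", path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load auth factory from {path!r}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the user's file
    factory = getattr(module, attr, None)
    if not callable(factory):
        raise TypeError(f"{path!r} does not define a callable named {attr!r}")
    return factory
```

pip would then call the returned factory once and use the result as the session's auth object. Note that this executes arbitrary user code inside pip's process, which is part of what a design discussion would need to weigh.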
@schlamar I agree that a requests auth handler is a simple solution to the use case of authenticating to a PEP 503 repository over HTTPS. For many users I'm sure that is all they need.
Unfortunately I'm a bit more ambitious and would like a plugin system that does not require the use of any specific transport or application protocol, nor require the package repository to adhere to PEP 503.
Expanding on my S3 example above - I could have a simple repository being hosted simply on an S3 bucket - no custom HTTP endpoints whatsoever, no HTML files, all that's required is some pip-remote-s3 client-side script, which knows how to discover the dists. The subprocess communication protocol need not be "complicated" - in fact it can be even simpler than PEP 503's "Simple Repository API".
@tharradine I see. However, I think this should be discussed in a separate issue (support for custom protocols instead of custom authentication handlers).
@schlamar That's fair enough, I suppose the two concepts are not mutually exclusive and both solutions could well be accepted.
Things I'd want to see in any concrete proposal to handle this:
requests to httpx for our network protocol? It's not impossible that we would do this...)
Reasons I think these are important:
It's really hard to thrash out this sort of "wider issue" in the context of an open source issue tracker/pull request workflow. That's where a funded project, with a clear scope and a remit to look at the broad implications, is a potential way forward for proposals like this. And where the use case is specifically around "corporate" infrastructure like private repositories, some sort of funding can help bridge the gap between volunteer resources who have no "itch to scratch" in this area, and businesses that depend on such support but don't otherwise have a means to influence what features get accepted.
Remember, the pip developer team consists of a very small number of wholly volunteer contributors. We're working on trying to make things more sustainable, but in the meantime we have to be careful how we manage feature additions. Funded development is one way of doing this that we're exploring.
(And yes, I understand that the above makes something that "seems simple" into quite a big project. I don't apologise for that - changes to pip can have a huge impact, and we owe it to all of our users to do our best to ensure they are well managed).
I imagine most of the folks interested in this are operating in a corporate setting, with infrastructure set up for running an internal PyPI.
That's a good audience to point to the fact that the PSF's Packaging WG has this listed as a fundable project: https://github.com/psf/fundable-packaging-improvements/blob/master/FUNDABLES.md#architecture-to-support-alternative-authentication-methods-in-packaging-tools
Please contact the Packaging WG by emailing packaging-wg@python.org to ask us to estimate how much one of these improvements would cost; we'll get back to you within a few business days.
I made an attempt at resolving this with minimal changes to pip itself: https://github.com/pypa/pip/commit/0205e2e7a18156972ca975baa404a01387123895
@pfmoore I'd love your feedback as to whether you think this would resolve the requirements you listed here.
My hope for this is that users would be able to supply completely custom authentication headers for AWS S3 or, say, Kerberos authentication over HTTP. All the implementation details would be up to the auth override module developer.
The basic assumption in my initial implementation about pip internals is that there will be a module with an "AuthBase" class to implement. This isn't strictly necessary, as it would also be possible to define a class with a __call__ "hook" supplied to MultiDomainBasicAuth, which gets a first look at the request URL and returns 'None' if it's uninterested in the URL.
any OS, really
Description:
This is a feature request.
It would be super-awesome++ if pip supported custom authentication handler configuration so private pypi repositories are not restricted to http basic auth only. Basically, make MultiDomainBasicAuth the default and no longer the ONLY option in a PipSession as it is today: https://github.com/pypa/pip/blob/9.0.1/pip/download.py#L331-L332
This limitation prevents easy integration with stronger authentication (e.g. 2-way TLS, 2FA, etc.) and SSO schemes at enterprises with private pypi repositories. The lack of support makes basic auth credential distribution and leaking unnecessarily difficult problems to address and combat.
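One way to picture "default, but not the only option" is a chain of handlers with the existing behaviour as the fallback. The sketch below is an illustration under stated assumptions, not pip's API: `HostTokenHook` and `headers_for` are hypothetical names, and the `fallback` callable merely stands in for whatever MultiDomainBasicAuth does today.

```python
from urllib.parse import urlsplit


class HostTokenHook:
    """Example custom handler: claims one host and declines every other URL."""

    def __init__(self, host, token):
        self.host = host
        self.token = token

    def __call__(self, url):
        if urlsplit(url).hostname != self.host:
            return None  # not interested; let the next handler decide
        return {"Authorization": f"Bearer {self.token}"}


def headers_for(url, hooks, fallback):
    """Return headers from the first hook that claims `url`, else from `fallback`."""
    for hook in hooks:
        headers = hook(url)
        if headers is not None:
            return headers
    return fallback(url)
```

With an empty list of hooks this reduces to the fallback alone, so users who install nothing would see no change in behaviour.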