pypi / warehouse

The Python Package Index
https://pypi.org
Apache License 2.0

Add support for API keys #994

Closed edmorley closed 5 years ago

edmorley commented 8 years ago

A scary number of people embed their PyPI username and password in their Travis config (using Travis encrypted variables), to enable automatic releases for certain branches (Travis even has a guide for it).

In addition, the packaging docs example encourages users to save their password in plaintext on disk in their .pypirc (they can of course use twine's password prompting, but I wonder how many read that far, rather than just copy the example verbatim?)

Whilst in an ideal world credentials of any form wouldn't be saved unencrypted to disk (or given to a third party such as Travis), and users would instead be prompted every time, I don't think this is realistic in practice.

API keys would offer the following advantages:

  1. Higher-entropy credentials that are guaranteed to have not been reused on multiple sites.
  2. The ability to give the API key a smaller permissions scope than that of the owner's username/password. For example, an API key would not be permitted to change a user's listed GPG key or, in the future, their 2FA settings. Or an API key could be limited to a specific package.
  3. Since this would be separate from the existing username/password auth, a signing based approach (eg HMAC) could be used, without breaking older clients. This would ensure that if a connection was MiTMed (eg due to a protocol or client exploit), the API key itself would still remain secure.
  4. Eventually support could be dropped for the password field in .pypirc, leaving a much safer choice between password prompting every time, or creating an API key that could be saved to disk.
  5. If/when support is added for 2FA, users who need to automate PyPI uploads won't have to forgo 2FA for their whole account. They could instead choose to just create a 2FA-circumventing API key for just the one package that needs uploads in automation.
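
To make point 3 concrete, here is a minimal sketch of HMAC request signing. It is an illustration only, not a scheme PyPI adopted: the function names, the choice of signed fields, and the five-minute freshness window are all assumptions made for the example.

```python
import hashlib
import hmac
import time

# Hypothetical helpers for illustration; PyPI does not expose this scheme.
MAX_SKEW_SECONDS = 300  # assumed freshness window


def sign_request(secret: bytes, method: str, path: str, body: bytes, timestamp: int) -> str:
    """Return a hex HMAC-SHA256 signature over the request fields."""
    body_digest = hashlib.sha256(body).hexdigest()
    message = "\n".join([method.upper(), path, str(timestamp), body_digest]).encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret: bytes, method: str, path: str, body: bytes,
                   timestamp: int, signature: str, now=None) -> bool:
    """Recompute the signature server-side and compare in constant time."""
    now = int(time.time()) if now is None else now
    if abs(now - timestamp) > MAX_SKEW_SECONDS:
        return False  # stale or future-dated request
    expected = sign_request(secret, method, path, body, timestamp)
    return hmac.compare_digest(expected, signature)
```

Only the signature travels with the request, so someone who intercepts the connection does not learn the key itself. A captured request could still be replayed inside the freshness window unless the server also tracks nonces, which this sketch leaves out.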

Many thanks :-)

(I've filed this against warehouse since I'm presuming this is beyond the scope of maintenance-only changes being made to the old PyPI codebase)

ewdurbin commented 5 years ago

I just realised that there's a patent 😨

Spoke with @VanL about this issue and we're looking into it, let's continue to discuss the technical details of API keys for PyPI here.

brainwane commented 5 years ago

@woodruffw's #6084 is ready for you to test it, if you're adventurous and you can set up a Warehouse developer environment! In case you haven't tested this kind of feature locally before, here's how:

  1. follow the setup guide up through the "Building the Warehouse Container" step (this will take a few minutes)
  2. fetch the branch to test locally by running git fetch upstream pull/6084/head:api-keys
  3. switch to that branch: git checkout api-keys
  4. go back to the setup guide and run make build again (may take a few minutes)
  5. keep going in the setup guide through "Viewing Warehouse in a browser" and log in as a user -- for instance, brainwane -- with password password
  6. go to Account Settings, verify that you see "Manage API tokens", click on that
  7. create a new API token and copy the resulting string
  8. make distribution packages -- for instance, you could clone the Forms990-analysis repository, update the version in setup.py, and make a wheel and sdist
  9. use Twine to upload your distribution to your local environment. Will says, "Our current auth-policy is drop-in compatible with Twine and distutils. When using a token, your "username" will be @token and your "password" will be the token itself." So, if your token is Ab9GpH-H5y your command will be twine upload --repository-url http://localhost/legacy/ -u @token -p Ab9GpH-H5y dist/* (but actual tokens are 160+ characters long).
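
Step 9 can also be scripted. The sketch below only builds the command line and environment for Twine without running anything. `build_twine_upload` is a hypothetical helper, and it assumes Twine's `TWINE_USERNAME` and `TWINE_PASSWORD` environment variables are used to pass the credentials, which keeps the token out of the visible command line. The `@token` username is the one quoted above for this test branch.

```python
import glob
import os


def build_twine_upload(token, repository_url="http://localhost/legacy/",
                       dist_pattern="dist/*", username="@token"):
    """Return (argv, env) for a Twine upload. Hypothetical helper; nothing is executed."""
    files = sorted(glob.glob(dist_pattern))
    argv = ["twine", "upload", "--repository-url", repository_url, *files]
    # Copy the current environment and add the credentials Twine reads from it.
    env = dict(os.environ, TWINE_USERNAME=username, TWINE_PASSWORD=token)
    return argv, env
```

To perform the upload, pass the result to `subprocess.run(argv, env=env, check=True)`.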

webknjaz commented 5 years ago

@brainwane filed a bug: https://github.com/pypa/warehouse/issues/6262

brainwane commented 5 years ago

Upload-only API tokens (both user-scoped and project-scoped) are now in beta on PyPI and Test PyPI! Our update on Discourse is at https://discuss.python.org/t/pypi-security-work-multifactor-auth-progress-help-needed/1042/31 .

Uploading with an API token is currently optional but encouraged; in the future, PyPI will set and enforce a policy requiring users with two-factor authentication enabled to use API tokens to upload (rather than just their password sans second factor). Once the beta period for API tokens is complete, we will make a launch announcement on the pypi-announce mailing list, and start to notify project maintainers and owners of the upcoming policy change. Then, after a suitable waiting period, we will begin to enforce this restriction, and include a notice in the error message returned to clients.

davidism commented 5 years ago

Is there any chance of adding 2FA support for uploads, as opposed to only accepting tokens? Seems like 2FA should be supported and preferred for dev machines. Storing API tokens locally doesn't seem any more secure than storing the username and password locally in that case.

moshez commented 5 years ago

"more secure" depends, of course, on your threat model.

One common problem with storing secrets locally is that they are available to any future application that runs as the user (any current application that runs as the user is a threat for 2FA-based systems, since it can directly hijack the session). However, this threat can be mitigated by only storing short-lived tokens. The Macaroon system we are implementing allows adding such validity caveats to tokens before storing them. For example, you could create a token valid for 5 minutes before each upload.
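
As a rough illustration of how a validity caveat can be added to a token without contacting the server, here is a toy model of macaroon-style HMAC chaining using only the standard library. It is not Warehouse's implementation or any real macaroon library's API: the dict layout, the function names, and the `time < <unix timestamp>` caveat syntax are assumptions made for the example.

```python
import hashlib
import hmac
import time


def _chain(sig: bytes, data: str) -> bytes:
    # Each step keys the HMAC with the previous signature.
    return hmac.new(sig, data.encode(), hashlib.sha256).digest()


def mint(root_key: bytes, identifier: str) -> dict:
    """Server side: create an unrestricted token (toy representation)."""
    return {"id": identifier, "caveats": [], "sig": _chain(root_key, identifier)}


def attenuate(token: dict, caveat: str) -> dict:
    """Holder side: add a caveat without knowing the root key."""
    return {"id": token["id"],
            "caveats": token["caveats"] + [caveat],
            "sig": _chain(token["sig"], caveat)}


def verify(root_key: bytes, token: dict, now=None) -> bool:
    """Server side: replay the chain and check every caveat."""
    now = time.time() if now is None else now
    sig = _chain(root_key, token["id"])
    for caveat in token["caveats"]:
        key, sep, value = caveat.partition(" < ")
        if key != "time" or not sep:
            return False  # unknown caveat: fail closed
        try:
            expiry = float(value)
        except ValueError:
            return False
        if now >= expiry:
            return False  # token has expired
        sig = _chain(sig, caveat)
    return hmac.compare_digest(sig, token["sig"])
```

In this model a client would call something like `attenuate(long_lived, f"time < {time.time() + 300}")` just before an upload and store only the restricted copy. Removing the caveat afterwards breaks the signature chain, so the restricted copy cannot be widened again.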

In addition, it is also expected and straight-forward to invalidate API tokens through the UI.

Can you indicate what threat model you think 2FA for uploads solves that short-lived tokens do not?

Carreau commented 5 years ago

PyPI will set and enforce a policy requiring users with two-factor authentication enabled to use API tokens to upload

Are there any discussions as to where and how the project-name:api-token mapping should be stored? Typically, to tell twine which token it should use? In CI it's easy with an env variable; not so much on dev machines, which may release multiple projects.

Or do you just expect devs to use a token with the same scope as the user?

davidism commented 5 years ago

So you're saying the workflow to upload a package from a dev machine would be to log in to pypi with 2FA, get a short lived token, and tell twine about it? Why not just cut out the middle step and tell twine about 2FA?

ThiefMaster commented 5 years ago

That sounds like a terrible idea to me. I really do not want to have to involve a browser to publish packages, or to enter my password in a CLI, which would be worse since I generally do not know my passwords but generate them from a master password or randomly (and then store them in a password manager). I think there are two use cases here:

When publishing to npm right now, everything is straightforward: I run npm publish, it tries to log in with the access token stored locally, and because I have 2FA enabled it asks me for a 2FA token, then retries the publish using the access token and the 2FA token. This not only adds extra security but also prevents accidentally publishing something, since every time you publish you need to enter a 2FA token!

So ideally, I'd like to have the same behavior with pypi/twine. I wouldn't mind if it was internally using a long-lived token that requires 2FA in addition and created a short-lived token to do the actual publish. This would actually be convenient for cases where you publish multiple packages in one go so you don't need to enter multiple tokens (and even allow reuse of a TOTP which is probably a bad idea).

fschulze commented 5 years ago

This could be implemented by adding a 2fa caveat. In the UI one would create a token with a 2fa requirement (maybe with a simple checkbox). Then twine would see that token (the macaroons can easily be inspected), ask the user for the 2fa code, add it to the existing token and send that to pypi. The new token would expire automatically when TOTP is used and the token from the UI would be useless without the 2fa code.

I think this sounds sensible, useful and relatively straightforward to implement, unless I overlooked something fundamental.

pganssle commented 5 years ago

One thing I would like to throw into the mix here with regards to 2FA for uploads is that #726 proposes a "two phase" upload, where packages are uploaded from the command line and (optionally) go into a "staging area" before they become available to the public and immutable. I will find that enormously useful for other reasons, but it also adds another workflow where the final upload is necessarily gated by 2FA - uploads into the staging area would be possible using the upload API key but none of that would actually be published without logging in with the 2FA key.

Obviously that doesn't help anyone who deliberately wants to avoid using the browser as any part of their upload workflow, but it may be more convenient than the "get a short-term key from the browser and paste it into twine" workflow.

woodruffw commented 5 years ago

Just 0.02c from the implementation side: yes, I think the right way to do this would be with an additional 2FA caveat as @fschulze proposed. That's out of scope for the current work (and would involve changes to twine and other uploaders), but wouldn't be too difficult to implement.

OTOH, single-use and/or time-scoped tokens (as proposed by @moshez) that require a second factor for minting would provide similar security properties and potentially be less invasive for automatic deployments.

webknjaz commented 5 years ago

@Carreau does it really make sense to have multiple tokens on a dev machine? If yes, you could have multiple "repository" entries, one per token/project. If no (better DX), just use a user-wide token. There's been some discussion @ https://twitter.com/Ewjoachim/status/1154479563419869184
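
A minimal sketch of the "multiple repository entries" idea, generating `.pypirc` text with one section per project. The helper name, section names and token strings are placeholders, and the default `__token__` username is the one this thread settles on later, so treat it as an assumption at this point in the discussion.

```python
import configparser
import io


def render_pypirc(project_tokens, repository="https://upload.pypi.org/legacy/",
                  username="__token__"):
    """Render .pypirc text with one section per project. Hypothetical helper."""
    # RawConfigParser avoids '%' interpolation, which token strings could trip over.
    config = configparser.RawConfigParser()
    config["distutils"] = {"index-servers": "\n" + "\n".join(project_tokens)}
    for name, token in project_tokens.items():
        config[name] = {"repository": repository,
                        "username": username,
                        "password": token}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()
```

With a file like this in place, `twine upload --repository proj-a dist/*` would select the `proj-a` section. The tokens still sit in plaintext on disk, which is the concern raised at the top of this issue.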

Carreau commented 5 years ago

does it really make sense to have multiple tokens on dev machine

For me yes it does. I only want my work machine to be able to publish some packages; and vice versa for my home machine. Also tokens are "upload only" (or are going to be); so I can keep my password safer.

For now I'm good with a custom solution, but I would love an agreed-upon way of doing it before various incompatible solutions emerge.

brainwane commented 5 years ago

Heads-up for people trying the beta of uploading with API tokens:

graingert commented 5 years ago

Heads-up for people trying the beta of uploading with API tokens:

  • The Travis problem in #6287 means we'll probably be changing the token prefix and the username you use for uploads. We may also end up changing other stuff. It's a beta!

That's ok, as long as you don't change existing tokens

brainwane commented 5 years ago

@graingert I'm sorry, but yes, we will probably be making it so that tokens you have already created do not work. As the manager on this project I'm comfortable making that choice during this beta, since we have warned people that there was a chance this would need to happen during the beta. To quote @ewdurbin in https://github.com/pypa/warehouse/issues/6287#issuecomment-516858524 ,

We know who have provisioned API tokens and can email them to give them a headsup 24 hours before disabling the older grammar.

ewdurbin commented 5 years ago

We have updated the token username and prefix in #6342.

username: @token => __token__
password/token: pypi:<base64 token body> => pypi-<base64 token body>

These changes should alleviate the need for escaping heroics.

The previous format will continue to work for now, but users will be notified to update their configurations to match the new syntax before the beta period is over.
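
For anyone updating stored configuration by script, the mapping above could be applied roughly as follows. `upgrade_credentials` is a hypothetical helper, not part of Warehouse or Twine. It only touches credentials whose username marks them as token credentials, so ordinary passwords pass through unchanged.

```python
from typing import Tuple


def upgrade_credentials(username: str, password: str) -> Tuple[str, str]:
    """Map beta-era token credentials to the new syntax. Hypothetical helper."""
    if username not in ("@token", "__token__"):
        return username, password  # regular username/password: leave alone
    if password.startswith("pypi:"):
        # Old prefix 'pypi:' becomes 'pypi-'; the token body is unchanged.
        password = "pypi-" + password[len("pypi:"):]
    return "__token__", password
```

Running it on credentials that are already in the new format returns them as they are.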

brainwane commented 5 years ago

@takluyver asked:

Is there any plan for an API to create upload tokens? E.g. I'd like to have a command-line tool prompt me once for my password & 2FA code, then obtain and store a project-scoped token to use for uploads.

Sorry for moving your comment here, @takluyver, but I want to keep this issue focused on API keys and that issue on the rollout!

We don't have a specific plan for that API feature yet, no. I filed your request as #6396.

brainwane commented 5 years ago

We've rolled out scoped API tokens for package upload on PyPI. The feature is in beta; #5661 is the meta-issue where we are tracking the rollout, the last few items to fix before ending the beta, and the policy changes (requiring API token usage for some users) we'll make after that.

We've now implemented all the items in this API token checklist. Some features are out of scope for our current funding:

So, per agreement with other maintainers in that meeting, I'm closing this issue.

Please enjoy upload API tokens on Warehouse, and file new issues to request new API key-related features. Thank you all!