pypi / warehouse

The Python Package Index
https://pypi.org
Apache License 2.0
3.6k stars 963 forks source link

Uploads: Remove the `username=__token__` requirement? #15140

Open woodruffw opened 10 months ago

woodruffw commented 10 months ago

Now that 2FA is mandatory, username/password pairs can no longer be used for package upload.

This affords a potential simplification in the upload endpoint's credential format: the phony __token__ username is no longer needed for disambiguation, since all "passwords" are now just API tokens of the form pypi-....

Pros: Delete a small amount of code, remove a shoehorned special value, one less configuration step/variable.

Cons: Probably needs a bunch of doc updates, user benefit is marginal (?)

CCing @miketheman for opinions 🙂

miketheman commented 10 months ago

Now that 2FA is mandatory, username/password pairs can no longer be used for package upload.

We still have the BasicAuth security policy in place today, which will inform the user using user/pass about the new approach. It's currently post-validation, so it might be useful to move that all the way up front, and thus remove a lot of that logic.

Considering the dominating client for uploads is twine, it would make most sense to me to have that client evolve to:

I think it'll take a significant amount of long-tail time until users have safely migrated to whatever the new approach is. And yes, definitely need updates to all docs everywhere 🙈

In your Pros, you call out some code reduction - can you poiint out what would be able to go away?

di commented 10 months ago

Considering the dominating client for uploads is twine, it would make most sense to me to have that client evolve to:

Note that twine is still used for many third-party package indexes that aren't PyPI -- it likely can't drop support for username/password auth entierly anytime soon, but it could possibly change it's behavior by special-casing PyPI.

miketheman commented 10 months ago

it likely can't drop support for username/password auth entierly anytime soon, but it could possibly change it's behavior by special-casing PyPI.

I think @woodruffw is all over that already ;) https://github.com/pypa/twine/pull/1040

woodruffw commented 10 months ago

In your Pros, you call out some code reduction - can you poiint out what would be able to go away?

Emphasis on very small 😅 -- I was mostly just thinking about the 1-2 places where username == "__token__" is checked explicitly, and a few of the places where we unmash the basic auth/other credential input formats (if those are still different?). Realistically, this would be <10 lines saved, so this is barely a "pro."

dstufft commented 10 months ago

I think that nearly 100% of authentication happens by smuggling the token through basic auth as the password, and the basic auth stuff requires there to be something for username (even if it's the empty string). I guess we could just ignore the username and just let it be anything? Though it seems useful still to register explicit intent.

woodruffw commented 10 months ago

Yep, I believe it'd have to be the empty string in that case. So yeah, this probably currently has limited practical value.

miketheman commented 10 months ago

smuggling the token through basic auth as the password,

For reference, here's how it works:

https://github.com/pypi/warehouse/blob/5147ff20b8b9f1c23a992e77613693f466e58e9f/warehouse/macaroons/security_policy.py#L75-L78

and later, we check the username:

https://github.com/pypi/warehouse/blob/5147ff20b8b9f1c23a992e77613693f466e58e9f/warehouse/macaroons/security_policy.py#L44-L45

I'd be much happier to see the "basic" part go away, and for folks to use "token" or "bearer" method and skip the need for username extraction.

woodruffw commented 10 months ago

100% agreed -- maybe it'd be good to add stats/counters to those branches, to figure out when it'd be okay to remove the basic auth path?

dstufft commented 10 months ago

I suspect nearly 100% of uses currently use the basic auth branch-- afaik it's the only thing Twine supports, and the only thing most upload tools support. There's no standard around the bearer auth thing currently, so upload clients would have to special case PyPI to support it.

woodruffw commented 10 months ago

so upload clients would have to special case PyPI to support it.

I've been recently making other special-case changes to twine (for the PGP deprecation + hardcoded __token__ user on PyPI and TestPyPI), so it probably wouldn't be too hard to add an additional special-case for the auth mechanism. That wouldn't help with the very long tail, though...

dstufft commented 10 months ago

I got distracted by non OSS stuff so I hadn't been able to work on it, but I did have https://github.com/python/peps/pull/3172 I started on that would allow handling these cases without special casing them.