pypi / warehouse

The Python Package Index
https://pypi.org
Apache License 2.0

Protection for account resurrection attacks on GitLab OIDC Trusted Publishing #15643

Open facutuesca opened 8 months ago

facutuesca commented 8 months ago

(For context, a GitLab namespace is like a GitHub organization or user. For example, samplenamespace is the namespace of the https://gitlab.com/samplenamespace/repository project.)

As specified in the GitLab OIDC publisher documentation, currently PyPI does not fetch the GitLab namespace ID when setting up a Trusted Publisher.

This means the claim verification is done against the namespace's (string) name. For example, setting up a trusted publisher for https://gitlab.com/samplenamespace/sampleproject//release.yml will check that the OIDC claims contain the namespace samplenamespace.

In contrast, the GitHub Trusted Publishing implementation fetches the org/user ID when the Trusted Publisher is created, and checks the claim against that ID. This protects against account resurrection attacks: a namespace can be deleted and a new one created with the same name, but the new one will have a different ID.
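To make the difference concrete, here is a minimal sketch of the two kinds of check. This is not Warehouse's actual verification code: the function names and the ID values are hypothetical, and only the `namespace_path` and `namespace_id` claim names come from GitLab's ID tokens.

```python
def verify_by_name(expected_namespace: str, claims: dict) -> bool:
    """Name-based check (what the GitLab publisher does today, per this issue)."""
    return claims.get("namespace_path") == expected_namespace


def verify_by_id(expected_namespace_id: str, claims: dict) -> bool:
    """ID-based check (analogous to the GitHub org/user ID check)."""
    claimed_id = claims.get("namespace_id")
    return claimed_id is not None and str(claimed_id) == expected_namespace_id


# Hypothetical claims: the original namespace, and a new namespace that was
# registered under the same name after the original was deleted.
original = {"namespace_path": "samplenamespace", "namespace_id": "1001"}
resurrected = {"namespace_path": "samplenamespace", "namespace_id": "2002"}

name_check_fooled = verify_by_name("samplenamespace", resurrected)
id_check_fooled = verify_by_id("1001", resurrected)
```

With these made-up values, the name-based check accepts the resurrected namespace, while the ID-based check rejects it.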

Currently, we can't implement this the same way we do for GitHub TPs because of a limitation in GitLab's API: there is no way to fetch the ID of a public personal namespace (think username in https://gitlab.com/username/repository). See this comment for an explanation and more details of the limitation.

A possible solution for this could be implementing a TOFU (Trust On First Use) scheme, where the ID of the namespace is read from the claims of the OIDC token the first time the Trusted Publisher is used, and stored as part of the TP's configuration and used to validate future claims.
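A rough sketch of what such a TOFU check could look like, assuming the namespace ID is stored on the publisher record once it has been seen. The class and field names here are hypothetical and do not reflect Warehouse's models; `namespace_path` and `namespace_id` are the GitLab ID token claims.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GitLabPublisherSketch:
    """Hypothetical stand-in for a stored GitLab Trusted Publisher."""

    namespace: str
    namespace_id: Optional[str] = None  # unknown until the first successful use

    def verify(self, claims: dict) -> bool:
        # The name check still applies on every use.
        if claims.get("namespace_path") != self.namespace:
            return False

        claimed_id = claims.get("namespace_id")
        if claimed_id is None:
            return False
        claimed_id = str(claimed_id)

        if self.namespace_id is None:
            # First use: trust the ID from the token and pin it.
            self.namespace_id = claimed_id
            return True

        # Later uses: the claim must match the pinned ID.
        return claimed_id == self.namespace_id


publisher = GitLabPublisherSketch(namespace="samplenamespace")
first_use = publisher.verify({"namespace_path": "samplenamespace", "namespace_id": "1001"})
resurrected = publisher.verify({"namespace_path": "samplenamespace", "namespace_id": "2002"})
```

Note that the pin only protects uses after the first one: if the namespace were deleted and re-registered before the Trusted Publisher is ever used, the new namespace's ID is the one that would be pinned.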

cc @di @woodruffw

woodruffw commented 8 months ago

Thanks @facutuesca!

(I asked @facutuesca to file this as a separate issue so that we don't lose track of it. As previously discussed, it isn't a priority for the current implementation given limitations on GitLab's side + complexity tradeoffs on PyPI's side.)

nejch commented 3 weeks ago

As a mitigation, there is an endpoint that could potentially be used to periodically check whether a (private) namespace still exists, and to invalidate the Trusted Publisher as soon as the namespace is gone, if that scaled well: https://docs.gitlab.com/ee/api/namespaces.html#get-existence-of-a-namespace

But like the other endpoint discussed in the previous issue, this actually requires authenticated access, even on gitlab.com, and both always return 401 otherwise.
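For illustration, a periodic check against that endpoint might look roughly like the sketch below. The URL path, the `PRIVATE-TOKEN` header and the `exists` response field are my reading of the linked documentation page and should be treated as assumptions; the function names are hypothetical, and the token is exactly the authenticated access mentioned above.

```python
import json
import urllib.parse
import urllib.request

GITLAB_API = "https://gitlab.com/api/v4"  # assumption: gitlab.com, API v4


def build_exists_request(namespace: str, token: str) -> urllib.request.Request:
    """Build the authenticated request; unauthenticated calls get a 401."""
    path = urllib.parse.quote(namespace, safe="")
    return urllib.request.Request(
        f"{GITLAB_API}/namespaces/{path}/exists",
        headers={"PRIVATE-TOKEN": token},
    )


def parse_exists_response(body: bytes) -> bool:
    """Read the (assumed) boolean `exists` field from the JSON body."""
    return bool(json.loads(body).get("exists", False))


def namespace_exists(namespace: str, token: str) -> bool:
    """Perform the actual network call (not exercised in this sketch)."""
    request = build_exists_request(namespace, token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_exists_response(response.read())
```

Even with a token, this only tells you whether the name is taken at polling time, so a namespace that is deleted and re-created between two polls would go unnoticed.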

> A possible solution for this could be implementing a TOFU (Trust On First Use) scheme, where the ID of the namespace is read from the claims of the OIDC token the first time the Trusted Publisher is used, and stored as part of the TP's configuration and used to validate future claims.

Because of the restrictions above, I think that might be a better solution even if an API becomes available, especially with https://github.com/pypi/warehouse/issues/15838 in mind. GitLab instances can have their discovery endpoints publicly exposed, their APIs behind a proxy, and runners with internet access to publish to pypi.org.