sigstore / gitsign

Keyless Git signing using Sigstore
Other
932 stars 62 forks source link

Exposes private email address from GitHub #65

Open leighmcculloch opened 2 years ago

leighmcculloch commented 2 years ago

Description

When authenticating with GitHub, gitsign appears to retrieve a users private email address and insert that address into the commit being signed. It does this, even if a user has configured their GitHub account to keep their email address private, and even if the user has configured their GitHub account to reject commits that contain their private email.

It's unclear why gitsign needs to expose a users primary email address configured in GitHub. It could instead expose the GitHub username, or the GitHub user id email address, which in my case is 351529+leighmcculloch@users.noreply.github.com.

It would be better privacy, and less surprising, if gitsign didn't pull information like email addresses into the commit.

For example, this is the commit created by using GitHub as the auth, where the asterisks are my private email address:

commit 10bde749ceef2544aab0c6d8d73db66e079307f0
tlog index: 2619545
smimesign: Signature made using certificate ID 0xad76a5fc3b3318f5d356e899c6b57f4fcea6aa24 | CN=sigstore-int
smimesign: Good signature from [********************]
Parsed Git signature: true
Validated Git signature: true
Located Rekor entry: true
Validated Rekor entry: true
Author:     Leigh McCulloch <351529+leighmcculloch@users.noreply.github.com>
AuthorDate: Thu Jun 9 16:25:37 2022 -0700
Commit:     Leigh McCulloch <351529+leighmcculloch@users.noreply.github.com>
CommitDate: Thu Jun 9 16:25:37 2022 -0700

Version

❯ gitsign --version
unknown

When I install gitsign using go install it installed from the v0.1.0 tag.

dlorenc commented 2 years ago

It would be better privacy, and less surprising, if gitsign didn't pull information like email addresses into the commit.

I agree. The email gets returned from GitHub during the oauth dance when you click "Login with GitHub". We should see if there's a way to get the privacy-preserving one instead. Github doesn't actually implement OIDC fully so we already have a few special cases there.

leighmcculloch commented 2 years ago

I don't think the privacy preserving email addresses actually show up explicitly in the GitHub API. But, the user preserving email can be constructed for any GitHub username using the user endpoint.

https://docs.github.com/en/account-and-profile/setting-up-and-managing-your-personal-account-on-github/managing-email-preferences/setting-your-commit-email-address#about-commit-email-addresses

For example: https://api.github.com/users/leighmcculloch

{
  "login": "leighmcculloch",
  "id": 351529,
  ...
}

And can then be formed with:

{id}+{login}@users.noreply.github.com
dlorenc commented 2 years ago

Oh wow, thank you for the pointers!

leighmcculloch commented 2 years ago

It's worth noting that users cannot receive emails at these addresses, the addresses are used for identification only. When I commit with 351529+leighmcculloch@users.noreply.github.com, GitHub assigns the commit to my user. If Sigstore only uses the email address for identification, this seems like a good fit.

dlorenc commented 2 years ago

Yep, just for stable identification and non-repudiation.

wlynch commented 2 years ago

As @dlorenc mentioned - this behavior is currently coming from the OIDC token being generated by Dex when we bounce users through the login flow.

I was chatting with @haydentherapper a few weeks ago about how we could incorporate the GitHub user ID into the fulcio flow so that we don't need to expose user emails. I was pointed towards the username identity adaptor which looks promising, and if we can combine this with Dex's federated_claims.user_id for stability (since github usernames can change) this may give us what we need to avoid user emails altogether without needing to special-case behavior for GitHub.

leighmcculloch commented 3 months ago

Stumbled on this again today and it occurred to me that in many ways this issue is not specific to GitHub. I used the gitsign keyless flow with a Google Account login, and my Google address was inserted into the commit too. I understand some unchanging identity is required here, and probably the email address for the Google Account makes a lot of sense too. My concern though is that there was no communication to me as the user that my personal information (email address) would be inserted into a public place.

In addition to the changes suggested above for continuing to protect the GitHub private email address, which I still think is important, the keyless flow for all logins could clearly communicate that the email address will be exposed by displaying that to the user either on the command line or in the browser: "Email address email@example.org will be used as the identity and visible in the commit."

haydentherapper commented 3 months ago

We include a message like that in Cosign - https://github.com/sigstore/cosign/blob/main/cmd/cosign/cli/sign/privacy/privacy.go#L22

leighmcculloch commented 3 months ago

Something like that would be beneficial, but even that message is rather generic and easy to breeze over. It'd be helpful if email addresses are the primary personal information being embedded if the data being included is explicitly included in the message.

haydentherapper commented 3 months ago

Long term, we are looking into hashing all emails as well, as that maintains verifiability while removing emails from the log.