r-lib / credentials

Tools for Managing SSH and Git Credentials
https://docs.ropensci.org/credentials

prefer or accept token keyed by username #16

Closed maxheld83 closed 3 years ago

maxheld83 commented 3 years ago

I'm running into a problem where credentials::set_github_pat(verbose = FALSE) (I think) won't find the correct entry in my (macOS) keychain, and therefore will always ask me interactively, even though there is a PAT in the keychain.

Here's how to reproduce this (an executable reprex is a bit difficult b/c of the interactive nature):

  1. set a PAT on GitHub (for example via usethis::create_github_token)
  2. ingest the token to the git credentials manager via gitcreds::gitcreds_set() (choose update; I already had info in there probably because of gh cli usage?)
  3. retrieve the token from the GCM (macOS keychain) and store it as an env var via credentials::set_github_pat()

This did not retrieve the token from the GCM, but instead asked for another one:

If prompted for GitHub credentials, enter your PAT in the password field
Password for 'https://PersonalAccessToken@github.com': 

─ Session info ───────────────────────────────────────────────────────────────
 setting  value                       
 version  R version 4.1.0 (2021-05-18)
 os       macOS Big Sur 11.4          
 system   x86_64, darwin17.0          
 ui       X11                         
 language en_US.UTF-8 git             
 collate  en_US.UTF-8                 
 ctype    en_US.UTF-8                 
 tz       Europe/Berlin               
 date     2021-06-17                  

─ Packages ───────────────────────────────────────────────────────────────────
 package     * version date       lib source        
 askpass       1.1     2019-01-13 [1] CRAN (R 4.1.0)
 cli           2.5.0   2021-04-26 [1] CRAN (R 4.1.0)
 credentials   1.3.0   2020-07-21 [1] CRAN (R 4.1.0)
 openssl       1.4.4   2021-04-30 [1] CRAN (R 4.1.0)
 sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.1.0)
 sys           3.4     2020-07-23 [1] CRAN (R 4.1.0)
 withr         2.4.2   2021-04-18 [1] CRAN (R 4.1.0)

[1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library

maxheld83 commented 3 years ago

I think this is because of:

https://github.com/r-lib/credentials/blob/721bef312cc7a6fc0b77280e14b040aedd660e6c/R/github-pat.R#L20

which then gets used in looking up a credential (Password for 'https://PersonalAccessToken@github.com' in the above).

The problem seems to arise b/c:

So, an immediate fix is to declare (username needs to be replaced obvs):

in .profile (or friends):

export GITHUB_PAT_USER=maxheld83

or in .Renviron:

GITHUB_PAT_USER=maxheld83

Then everything works as expected.
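For illustration, here is a minimal sketch of why the env var changes which keychain entry is queried. `pat_lookup_url()` is a hypothetical helper, not part of credentials; the fallback value and the URL shape are assumptions based on the linked line and the prompt text above.

```r
# Hypothetical helper (not the credentials API): builds the URL that would be
# used as the lookup key in the git credential store.
pat_lookup_url <- function(host = "github.com") {
  # Assumption: without GITHUB_PAT_USER, the pseudo-username
  # "PersonalAccessToken" is used.
  user <- Sys.getenv("GITHUB_PAT_USER", unset = "PersonalAccessToken")
  sprintf("https://%s@%s", user, host)
}

Sys.unsetenv("GITHUB_PAT_USER")
pat_lookup_url()
#> [1] "https://PersonalAccessToken@github.com"

Sys.setenv(GITHUB_PAT_USER = "maxheld83")  # use your own username
pat_lookup_url()
#> [1] "https://maxheld83@github.com"
```

If the stored entry is keyed by the real username, only the second URL would match it, which would explain the prompt.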

maxheld83 commented 3 years ago

If this "fix" is appropriate, I'll be happy to add a PR of the docs to that effect.

Though I don't know enough about the broader context here, and I wonder whether there's a more elegant solution to infer the username. I imagine it's difficult to figure out what the appropriate username is at session-load time (.Rprofile), as tangentially discussed in https://github.com/r-lib/remotes/issues/516 and https://github.com/r-lib/remotes/issues/488

maxheld83 commented 3 years ago

it appears this "divergence" between gitcreds and credentials has been anticipated by usethis / may be expected:

Sidebar: the gitcreds package plays the same role for gh as the credentials package does for gert. Both gitcreds and credentials provide an R interface to the Git credential store, but are targeting slightly different use cases. The gitcreds and credentials packages are evolving convergently and may, in fact, merge into one. But in the meantime, there is some chance that they use a different “key”, in the “key-value” sense, when storing or retrieving your PAT. Therefore, it is conceivable that gert/credentials may also prompt you once for your PAT, in which case you should just provide it again. To explicitly check if credentials can discover your PAT, call credentials::set_github_pat(). If it cannot, this will lead to a prompt where you can enter it.

So apologies if this is noise.

jennybc commented 3 years ago

I suspect this is a keying problem. But maybe not gitcreds vs. credentials, rather (gitcreds & credentials) vs. something else?

It might be interesting to go directly into the macOS keychain and intentionally hunt down and delete all creds stored for GitHub.com.

Then re-ingest your PAT with gitcreds and see if life is good. I don't think there is currently a systematic problem with gitcreds and credentials happily finding and using the same PAT (although yes, structurally this is possible and is why I wrote that caveat that you found). I have never set the GITHUB_PAT_USER environment variable FWIW and the credentials/gert stack and gitcreds/gh/usethis stack are working well (together).

Are you trying to do anything exotic, like store PATs for multiple usernames on GitHub.com?

maxheld83 commented 3 years ago

You're right, it was a keying problem.

# remove all relevant secrets from macOS keychain
gh::gh_token()  # confirm is empty
usethis::create_github_token()  # copy the token
gitcreds::gitcreds_set()  # paste in the token
gitcreds::gitcreds_get()

yields:

<gitcreds>
  protocol: https
  host    : github.com
  username: PersonalAccessToken
  password: <-- hidden -->

This pseudo-username PersonalAccessToken is exactly the key that credentials expects:

https://github.com/r-lib/credentials/blob/721bef312cc7a6fc0b77280e14b040aedd660e6c/R/github-pat.R#L20

As expected

credentials::set_github_pat()

then also makes remotes::install_github() and friends work.

My remaining questions are:

  1. is username: PersonalAccessToken idiomatic/canonical?
  2. how did I ever get into the username: maxheld83 state, which worked with gitcreds, but didn't work with credentials?

maxheld83 commented 3 years ago

re: bad keys

how did I ever get into the username: maxheld83 state, which worked with gitcreds, but didn't work with credentials?

It seems that GitHub's gh cli (!= r-lib/gh) is the "offender" here.

To disambiguate, I'm going to refer to both programs by their repo slugs:

gh auth refresh  # this is cli/cli
Rscript -e "gitcreds::gitcreds_get()"

Seems to deposit a token to the macOS keychain named by username and yields:

<gitcreds>
  protocol: https
  host    : github.com
  username: maxheld83
  password: <-- hidden -->

Because that's not a key expected by credentials, it will mess things up.

Should be noted that this token, as evidenced by the initial strings, is not a PAT, but an oauth token for an authorised github app:

gh::gh_whoami()
#> {
#>   "name": "Max Held",
#>   "login": "maxheld83",
#>   "html_url": "https://github.com/maxheld83",
#>   "scopes": "gist, read:org, repo, workflow",
#>   "token": "gho_...NEf6"
#> }

This cli/cli-based auth will work thus:

If I now run the auth workflow in the usethis vignette on top of this setup:

usethis::create_github_token()
gitcreds::gitcreds_set()
-> Your current credentials for 'https://github.com':

  protocol: https
  host    : github.com
  username: maxheld83
  password: <-- hidden -->

-> What would you like to do? 

1: Keep these credentials
2: Replace these credentials
3: See the password / token

Selection:   

Your newly minted token would replace the oauth token deposited by cli/cli.

I haven't checked the default scopes of usethis::create_github_token and gh auth refresh in detail, but both the oauth token and the PAT work with cli/cli (gh) and r-lib/gh.

Happily, everything seems to work just fine with the oauth token and this in .Rprofile:

Sys.setenv(GITHUB_PAT = gitcreds::gitcreds_get(use_cache = FALSE)$password)

Or, to appease credentials:

Sys.setenv(GITHUB_PAT_USER = "maxheld83")  # use your own username
credentials::set_github_pat()

maxheld83 commented 3 years ago

I'm a bit confused as to where this leaves us / what's at fault.

I'd guess that it makes sense to play nicely with what GitHub appears to think is the idiomatic/canonical way to deposit tokens into the macOS keychain. Since cli/cli and r-lib/gh pursue exactly the same purpose, it also makes sense that they would share scopes.

This might also be useful for people who have several hosts or usernames.

As an added bonus, playing nice with username-keyed oauth tokens might future-proof for a scenario where gitcreds (?) becomes an Authorised OAuth App. (I'd like that, because it gets rid of the manual copy/pasting of secrets).

So I'm wondering whether it would be nice if credentials preferred, or at least checked for username-keyed (oauth) tokens.

jennybc commented 3 years ago

I think what GitHub's cli is doing is actually non-canonical. Or at least that was the conclusion @jeroen @gaborcsardi and I reached several months ago, when we talked about exactly this at length.

As I recall, at that time, it seemed there was some semi-official standard to key a PAT with PersonalAccessToken and to NOT key a PAT with actual username. One reason for this is to NOT clobber a username/password that a user has stored. Now it's true that GitHub is well on its way to deprecating username/password access anyway, so in their mind this clobbering with a different token is probably a non-issue or maybe even viewed as desirable.

maxheld83 commented 3 years ago

One reason for this is to NOT clobber a username/password that a user has stored

I was worried about this too -- but on macOS at least, cli/cli deposits as what macOS keychain reports as "Internet password", as opposed to "Web form password" for the password you'd enter at github.com so they don't clash. So happily, I still have all my credentials 😅 for now.

maxheld83 commented 3 years ago

Anyway, please close this if it's unproductive -- I see there's been a lot more context/discussion around this.

maxheld83 commented 3 years ago

uh the more I think about it the more confused I am 🤔.

👍 On the one hand, it seems like a good idea to always rely on the same auth mechanism in various API wrappers and git use.

👎 On the other hand, (inadvertently) piggybacking on the cli/cli token will mush up scopes, quotas between different uses and make this github.com pretty meaningless:

gh

(To be clear, calls via credentials won't currently be subsumed there, but might be when the env var GITHUB_PAT_USER is set or this issue is otherwise resolved).

gaborcsardi commented 3 years ago

👎 On the other hand, (inadvertently) piggybacking on the cli/cli token will mush up scopes, quotas between different uses and make this github.com pretty meaningless:

I think it is pretty rare that you'd want a different token for each tool, if you use them to access the same git repositories, from the same machine, for interactive development.

We cannot do much about what the GH cli does and how it names its tokens. Naming them "GitHub CLI" probably makes sense for beginners, but it is not entirely accurate, since it is just a generic git(hub) token. (E.g. command line git will use the same token.)

If by quotas you mean rate limits, those are per user, not per token, so it does not matter which token you use.

What are you trying to achieve?

maxheld83 commented 3 years ago

If by quotas you mean rate limits, those are per user, not per token, so it does not matter which token you use.

yes, my bad.

What are you trying to achieve?

Sorry, got a little sidetracked here.

If sharing tokens across wrappers is considered appropriate practice, I'd like credentials to accept a username-keyed token (deposited e.g. by GH cli) much like gitcreds already does. Might just remove a small bit of friction.

Perhaps set_github_pat() could just check for a username keyed token, if it can't find one otherwise.
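A rough sketch of what I mean by that fallback, using a plain named list as a toy stand-in for the credential store. `find_token()` and the `login` argument are hypothetical; this is not how credentials is implemented, and real stores are queried through git credential helpers.

```r
# Toy stand-in for the credential store: a named list keyed by username.
find_token <- function(store, login = NULL) {
  # Preferred key first, then (optionally) the real username as a fallback.
  keys <- c("PersonalAccessToken", login)
  for (key in keys) {
    token <- store[[key]]
    if (!is.null(token)) {
      return(list(username = key, password = token))
    }
  }
  NULL  # nothing found: the caller would have to prompt interactively
}

# Only a username-keyed token exists (e.g. deposited by the GH cli).
store <- list(maxheld83 = "gho_dummy")
find_token(store)                               # NULL, i.e. would prompt
find_token(store, login = "maxheld83")$username # "maxheld83"
```

The open question with this approach is how the username would be known at lookup time without asking the user.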

jennybc commented 3 years ago

I think "we" are bound to experience pain if multiple tools are essentially fighting over who manages the PAT. We've managed to make credentials and gitcreds quite compatible.

But there are lots of usethis functions that only work with very specific scopes, which are the defaults for usethis::create_github_token().

So it could be very annoying if someone sets the PAT up the usethis way, experiences usethis success, etc. Then uses the GitHub cli, silently has the original PAT clobbered, then inexplicably can't do those usethis operations anymore.

So I think this is more a question of whether it's realistic to harden our tools to arbitrary credential behaviour by other tools.

gaborcsardi commented 3 years ago

I think this is probably a documentation issue. The GH CLI does not silently overwrite an existing token, it asks you if you want to use an existing token or set up OAuth. It also tells you if a scope is missing from an existing token.

If a scope happens to be missing for the GH cli, or for usethis, people can edit the token and add the missing scope. usethis could add the scope that the GH cli needs to the default, I think that would be reasonable.

IDK if we need to support OAuth tokens better. AFAICT gitcreds is fine with them, and GH seems to work as well, but I haven't checked thoroughly.

jennybc commented 3 years ago

The GH CLI does not silently overwrite an existing token, it asks you if you want to use an existing token or set up OAuth. It also tells you if a scope is missing from an existing token.

Ah, so the GH CLI could make use of an existing PAT in the credential store?

usethis could add the scope that the GH cli needs to the default, I think that would be reasonable.

Does anyone know already that the default scopes selected in usethis::create_github_token() are NOT sufficient for the GH cli (or whatever it regards as typical/default)?

I think the biggest problem is users being confused about which token is being used by which tool.

gaborcsardi commented 3 years ago

Ah, so the GH CLI could make use of an existing PAT in the credential store?

Yeah. Btw. you can also choose not to use the same token as git.

❯ gh auth login
? What account do you want to log into? GitHub.com
? You're already logged into github.com. Do you want to re-authenticate? Yes
? What is your preferred protocol for Git operations? HTTPS
? Authenticate Git with your GitHub credentials? Yes
? How would you like to authenticate GitHub CLI? Paste an authentication token
Tip: you can generate a Personal Access Token here https://github.com/settings/tokens
The minimum required scopes are 'repo', 'read:org', 'workflow'.
? Paste your authentication token:

So it is repo, read:org and workflow. usethis seems to have repo, user, gist, workflow.

So we would also need read:org it seems. (It is weird to me that usethis does not need this, but that's probably just the weirdness of these scopes.)
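The comparison can be spelled out in base R. The two vectors just restate the scope lists quoted above; they are not read from either tool.

```r
# Scopes as listed in this thread (not queried from usethis or the GH cli).
gh_cli_min       <- c("repo", "read:org", "workflow")
usethis_defaults <- c("repo", "user", "gist", "workflow")

# Scopes the GH cli asks for that the usethis defaults lack.
setdiff(gh_cli_min, usethis_defaults)
#> [1] "read:org"
```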

It would be better if the GH cli could detect that there is a token already; they probably don't know about the fake credential helper trick, or just don't want to do it.

gaborcsardi commented 3 years ago

I think the biggest problem is users being confused about which token is being used by which tool.

IDK, it does not seem too bad to me. Either they opt to use the same credentials as command line git or they use their own credentials. That's pretty much it. With the GH cli, you can choose, apparently.

maxheld83 commented 3 years ago

Ah, so the GH CLI could make use of an existing PAT in the credential store?

In my (limited) experience with gh auth *, it only used an existing PAT if it was keyed by username, which neither gitcreds nor credentials will do. If there was such an existing token (even if it's a PAT, not an oauth token), it would use that.

So I'd conjecture:

  1. gitcreds::gitcreds_set() / credentials::git_credential_update() -> gh auth login no problemo, because gitcreds/credentials use PersonalAccessToken as a key, and gh uses username.
  2. 🤔 gh auth login -> gitcreds::gitcreds_set() / credentials::git_credential_update() Potentially unexpected complications, because gitcreds::gitcreds_set() can overwrite the gh oauth token.

maxheld83 commented 3 years ago

btw, with GitHub's new token prefixes, it might be possible to catch some unexpected clobbering / confusion:

  1. gitcreds/credentials could warn on, or reject using any tokens which !startsWith(x, 'ghp').
  2. gitcreds/credentials could warn on, or reject overwriting any tokens which !startsWith(x, 'ghp').

That way, at least gitcreds/credentials would never futz with the gho_* OAuth access tokens deposited by gh.

Or maybe that'd just add more confusion & friction 🤷?
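To make the prefix idea concrete, a small sketch. `token_kind()` is a hypothetical helper and not part of gitcreds or credentials; it only distinguishes the two prefixes mentioned in this thread, and the tokens below are dummies.

```r
# Hypothetical helper: classify a token by its prefix.
token_kind <- function(token) {
  if (startsWith(token, "ghp_")) {
    "personal access token"
  } else if (startsWith(token, "gho_")) {
    "oauth token"
  } else {
    "unknown"  # e.g. old-format tokens without a prefix
  }
}

token_kind("ghp_dummy")
#> [1] "personal access token"
token_kind("gho_dummy")
#> [1] "oauth token"
```

A caller could then warn before using or overwriting anything that is not reported as a personal access token.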

gaborcsardi commented 3 years ago

Yes, these are good initial observations, but I am afraid that reality is more complicated and thus they are not completely accurate.

because gitcreds/credentials use PersonalAccessToken as a key

I can't speak for credentials, but gitcreds does not use a username in the credential queries, unless the user configures one, e.g. in the url: https://gaborcsardi@github.com. For example, with the default osxkeychain credential helper on macOS, gitcreds::gitcreds_get() returns the first token that it can find. The username does not matter. Command line git is the same, gitcreds just does what command line git does.

But other credential helpers might be different, see this if you must: https://github.com/r-lib/gitcreds/blob/master/vignettes/helper-survey.Rmd

When gitcreds sets a token, then it uses PersonalAccessToken as the user name, because the git credential system does not work without a username, but only if you don't set a username, either in the URL, or via the git config, or via an existing token. So when gitcreds_set() overwrites a credential, it will use its username, unless there was a different one in the URL or config.
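Roughly, as a sketch only: `pick_username()` below is a hypothetical function, not gitcreds code, and the relative order of URL versus config is an assumption made for illustration.

```r
# Null-coalescing helper: return x unless it is NULL.
`%||%` <- function(x, y) if (is.null(x)) y else x

# Hypothetical: which username a newly set credential would get.
pick_username <- function(url_user = NULL, config_user = NULL,
                          existing_user = NULL) {
  # Assumed order: URL, then git config, then the existing credential,
  # and only then the "PersonalAccessToken" placeholder.
  url_user %||% config_user %||% existing_user %||% "PersonalAccessToken"
}

pick_username()
#> [1] "PersonalAccessToken"
pick_username(existing_user = "maxheld83")
#> [1] "maxheld83"
```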

So in summary, there is no magical order of commands to create two sets of credentials that will both be used by one tool each. And there should not be two sets of credentials, really. Both tools aim to use the same credentials as command line git, so let them just do that.

OTOH if you do not want to use the same credentials as git in the GH CLI, I believe you can do that by this choice:

? Authenticate Git with your GitHub credentials? No

  1. gitcreds/credentials could warn on, or reject using any tokens which !startsWith(x, 'ghp').
  2. gitcreds/credentials could warn on, or reject overwriting any tokens which !startsWith(x, 'ghp').

No, gitcreds is completely agnostic about the authentication mechanism, it is not meant to be a tool for GitHub PATs only, so it is not going to do these things. Upstream packages might.

jennybc commented 3 years ago

I'm not convinced there's anything credentials or gitcreds can do that is clearly a net positive here. Other than maybe documenting that, if surprising things are happening, consider whether another app is also messing with your stored Git credentials.

maxheld83 commented 3 years ago

thanks for being so patient with my somewhat ill-defined issue here.

Other than maybe documenting that, if surprising things are happening, consider whether another app is also messing with your stored Git credentials.

Please reopen or ping me if I can contribute by adding this to the documentation (gitcreds? usethis vignette?)