sourcegraph / zoekt

Fast trigram based code search
Apache License 2.0

how to handle git credentials for git ops #578

Open xvandish opened 1 year ago

xvandish commented 1 year ago

Hello - I ran into this while deploying the indexserver on a machine yesterday.

While the zoekt-mirror-* commands have credential handlers that look for tokens or usernames when calling the APIs that list the repos in an org, user, etc., the rest of the git calls (fetch, clone, etc.) have no authentication wrappers.

I'm wondering how anyone who runs zoekt-indexserver on a server to mirror private repos generally handles this. I know it'll depend on the code host, but generally: do you all write the git credentials to a file? Use the git-credentials store? In my case, I'm specifically using GitHub as the host, with a PAT that has read scopes.

Right now I think I'm leaning towards writing a GitHub token as the password to the git-credentials store.

livegrep gets around this by using an environment variable for the GitHub token, then passing that variable through a pipe to git (via a custom askpass script) so it never touches the file system. Introducing something like that here would probably mean abstracting all git calls into something like callGit(args []string, username, password string), which would handle credentials if username and password aren't empty. I think that'd be a pretty involved change, and I'm not sure how many here would use it, so I'm really not leaning that direction atm.
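To make the shape of that wrapper concrete, here is a minimal sketch, not livegrep's actual mechanism and not existing zoekt code. It assumes a separate askpass helper (not shown) that prints the requested value from the environment. The extra askpass parameter, the credentialEnv helper, and the ZOEKT_GIT_USERNAME / ZOEKT_GIT_PASSWORD variable names are all hypothetical; GIT_ASKPASS and GIT_TERMINAL_PROMPT are standard git environment variables.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// credentialEnv returns a copy of base extended with the variables a
// hypothetical askpass helper would read. With empty credentials it returns
// base unchanged, so unauthenticated calls behave exactly as before.
func credentialEnv(base []string, askpass, username, password string) []string {
	if username == "" && password == "" {
		return base
	}
	env := append([]string{}, base...)
	return append(env,
		"GIT_ASKPASS="+askpass,             // program git runs to obtain credentials
		"GIT_TERMINAL_PROMPT=0",            // fail instead of prompting on a TTY
		"ZOEKT_GIT_USERNAME="+username,     // hypothetical name, read by the helper
		"ZOEKT_GIT_PASSWORD="+password,     // hypothetical name, read by the helper
	)
}

// callGit runs git with the given arguments, adding credentials to the
// child's environment only when they are provided.
func callGit(args []string, askpass, username, password string) ([]byte, error) {
	cmd := exec.Command("git", args...)
	cmd.Env = credentialEnv(os.Environ(), askpass, username, password)
	return cmd.CombinedOutput()
}

func main() {
	// No credentials: behaves like a plain "git --version".
	out, err := callGit([]string{"--version"}, "", "", "")
	fmt.Printf("%s(err=%v)\n", out, err)
}
```

In this sketch the token lives only in the child process environment rather than in a file, though passing it through the environment is a simplification of the pipe-based approach described above.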

Thoughts?

gl-srgr commented 1 year ago

Hello @xvandish, with regard to the GitHub code host specifically, since that is what you're mirroring: I believe the expectation in zoekt-mirror-github is to read the token from a .github-token file, so that would be the quickest way to get you set up.

If the token were read from an environment variable, then I believe we could avoid abstracting the git calls as you described, because we'd still instantiate our git client the same way we do now. So this could be an option to add in the future.
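As a rough illustration of that option, the sketch below prefers an environment variable and falls back to the token file. The resolveToken function and the GITHUB_TOKEN variable name are hypothetical and are not part of zoekt-mirror-github today. The lookup functions are passed in as parameters only to keep the example easy to exercise.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// resolveToken returns the token from the environment variable envVar if it
// is set and non-blank, otherwise the trimmed contents of the file at path.
func resolveToken(
	getenv func(string) string,
	readFile func(string) ([]byte, error),
	envVar, path string,
) (string, error) {
	if tok := strings.TrimSpace(getenv(envVar)); tok != "" {
		return tok, nil
	}
	data, err := readFile(path)
	if err != nil {
		return "", fmt.Errorf("no token in $%s and cannot read %s: %w", envVar, path, err)
	}
	return strings.TrimSpace(string(data)), nil
}

func main() {
	// GITHUB_TOKEN is an assumed variable name; .github-token is the file
	// mentioned in this thread.
	tok, err := resolveToken(os.Getenv, os.ReadFile, "GITHUB_TOKEN", ".github-token")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("token length:", len(tok))
}
```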

As for the git-credentials store, I don't believe we use it for zoekt-indexserver (or zoekt-sourcegraph-indexserver), but since git-credentials stores passwords in a plain-text file, it's not that different from the .github-token file approach. We could run commands to read the credential store for a given URL and use that for our client, but that would also need to be added to zoekt-mirror-github's main() startup logic.
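For reference, reading the store for a given URL could look roughly like the sketch below, which shells out to git credential fill and parses its key=value output. The fillCredential and parseCredential names are hypothetical, and this assumes a credential helper is already configured; nothing like this exists in zoekt as far as this thread establishes.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// parseCredential turns the key=value lines printed by "git credential fill"
// into a map. Values may themselves contain "=", so split on the first one.
func parseCredential(out string) map[string]string {
	fields := map[string]string{}
	for _, line := range strings.Split(out, "\n") {
		if k, v, ok := strings.Cut(line, "="); ok {
			fields[k] = v
		}
	}
	return fields
}

// fillCredential asks git's configured credential helpers for a credential
// matching the given protocol and host.
func fillCredential(protocol, host string) (map[string]string, error) {
	cmd := exec.Command("git", "credential", "fill")
	// Disable terminal prompts so a missing credential fails instead of hanging.
	cmd.Env = append(os.Environ(), "GIT_TERMINAL_PROMPT=0")
	cmd.Stdin = strings.NewReader(fmt.Sprintf("protocol=%s\nhost=%s\n\n", protocol, host))
	var out bytes.Buffer
	cmd.Stdout = &out
	if err := cmd.Run(); err != nil {
		return nil, err
	}
	return parseCredential(out.String()), nil
}

func main() {
	cred, err := fillCredential("https", "github.com")
	if err != nil {
		fmt.Println("no credential available:", err)
		return
	}
	// Print only whether a password was found, never the secret itself.
	fmt.Printf("username=%q, password present=%v\n", cred["username"], cred["password"] != "")
}
```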

xvandish commented 1 year ago

Hello, thanks for taking a look!

To clarify: the problem isn't zoekt-mirror-github, precisely because it uses .github-token (which is what I'm using) and so works just fine for me.

The problem is that the rest of the git calls made by the indexserver (or originating from commands that the indexserver/mirror-github runs) fail when they reach private repos, since they don't attempt to set credentials. For example:

git fetch (via periodicFetch) https://github.com/sourcegraph/zoekt/blob/b247fb51dece2d026582f7870a3449edff0f8500/cmd/zoekt-indexserver/main.go#L132

git clone (via zoekt-mirror-github -> gitindex.CloneRepo) https://github.com/sourcegraph/zoekt/blob/main/gitindex/clone.go#L57-L59

And... I think that's really it? For the moment I've gotten around this by setting a global git config, git config --global url.https://${GITHUB_KEY}:x-oauth-basic@github.com/.insteadof https://github.com/, as part of the container startup, and it's working fine, so this isn't at all high priority.
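For anyone who wants to do the same from Go rather than from a startup script, a minimal sketch of that workaround follows. The insteadOfArgs helper is hypothetical; the config key and the GITHUB_KEY variable are the ones from the command above. Note that this writes the token into the global git config in plain text.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// insteadOfArgs builds the "git config" arguments that rewrite
// https://github.com/ URLs to include the token as the username.
func insteadOfArgs(token string) []string {
	return []string{
		"config", "--global",
		"url.https://" + token + ":x-oauth-basic@github.com/.insteadOf",
		"https://github.com/",
	}
}

func main() {
	token := os.Getenv("GITHUB_KEY")
	if token == "" {
		fmt.Println("GITHUB_KEY not set; nothing to configure")
		return
	}
	if out, err := exec.Command("git", insteadOfArgs(token)...).CombinedOutput(); err != nil {
		fmt.Printf("git config failed: %v\n%s", err, out)
	}
}
```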

I noticed the Sourcegraph indexserver makes some git calls (like fetch) but does no apparent authentication: https://sourcegraph.com/github.com/sourcegraph/zoekt/-/blob/cmd/zoekt-sourcegraph-indexserver/index.go?L214-222. I'm guessing the container it's running in gets a GitHub token injected somewhere into the git config. Does that track? If that's how you all are doing it, I'm happy to keep using my hacky git config --global method.