Document how to use with Git over HTTP

bittner commented 3 years ago

Sometimes you can't use Git over SSH in a project (e.g. due to network constraints), so you have to use HTTP instead. I'm struggling to get this working, and I can't find any related information in our README.

My modulesync.yml looks like this:

---
git_base: https://gitlab.example.com/
branch: managed-update

The repository is private, hence I need to authenticate. It's not entirely clear how to do that properly (short of hardcoding https://username:password@gitlab.example.com/). The error I get is this one:

Syncing some/project
Cloning repository fresh
Cloning from https://gitlab.example.com/some/project.git
Error while updating some/project
/usr/local/bundle/gems/git-1.7.0/lib/git/lib.rb:989:in `command': git '-c' 'color.ui=false' clone '--' 'https://gitlab.example.com/some/project.git' 'modules/some/project'  2>&1:Cloning into 'modules/some/project'... (Git::GitExecuteError)
fatal: could not read Username for 'https://gitlab.example.com': No such device or address
            from /usr/local/bundle/gems/git-1.7.0/lib/git/lib.rb:78:in `clone'
            from /usr/local/bundle/gems/git-1.7.0/lib/git/base.rb:29:in `clone'
            from /usr/local/bundle/gems/git-1.7.0/lib/git.rb:98:in `clone'
            from /usr/local/bundle/gems/modulesync-2.0.0/lib/modulesync/git.rb:58:in `pull'
            from /usr/local/bundle/gems/modulesync-2.0.0/lib/modulesync.rb:115:in `manage_module'
            from /usr/local/bundle/gems/modulesync-2.0.0/lib/modulesync.rb:176:in `block in update'
            from /usr/local/bundle/gems/modulesync-2.0.0/lib/modulesync.rb:173:in `each'
            from /usr/local/bundle/gems/modulesync-2.0.0/lib/modulesync.rb:173:in `update'
            from /usr/local/bundle/gems/modulesync-2.0.0/lib/modulesync/cli.rb:142:in `update'
            from /usr/local/bundle/gems/thor-1.0.1/lib/thor/command.rb:27:in `run'
            from /usr/local/bundle/gems/thor-1.0.1/lib/thor/invocation.rb:127:in `invoke_command'
            from /usr/local/bundle/gems/thor-1.0.1/lib/thor.rb:392:in `dispatch'
            from /usr/local/bundle/gems/thor-1.0.1/lib/thor/base.rb:485:in `start'
            from /usr/local/bundle/gems/modulesync-2.0.0/bin/msync:8:in `<top (required)>'
            from /usr/local/bundle/bin/msync:23:in `load'
            from /usr/local/bundle/bin/msync:23:in `<main>'

Just for the sake of completeness, the setup uses the vshn/modulesync Docker image and roughly this CI configuration on GitLab:

update_repository:
  image: vshn/modulesync:latest
  variables:
    GITLAB_BASE_URL: ${CI_SERVER_URL}
    GITLAB_TOKEN: ${SECRET_TOKEN}
    GIT_COMMITTER_NAME: Edda Example
    GIT_COMMITTER_EMAIL: edda@example.com
  script: msync --pr --pr-labels=managed-update --force

Note that GITLAB_BASE_URL and GITLAB_TOKEN are really just for creating the merge request (via GitLab's API).

Document how this shall be done

Using Git over HTTP is probably possible as an alternative.

If we want to suggest using a token for the normal Git operation we should explain how to do this in the README.

bittner commented 3 years ago

Turns out it's currently possible to use Git over HTTP only via setting the git_base in modulesync.yml, e.g. for GitLab:

git_base: https://oauth2:${ACCESS_TOKEN}@${CI_SERVER_HOST}/

If you put this under version control and have a pipeline run msync, it probably makes sense to replace the above variables dynamically instead of hard-coding them, e.g. using sed or envsubst:

- envsubst < modulesync.yml > modulesync.yml.tmp && mv modulesync.yml.tmp modulesync.yml

EDIT: Alternatively, you can use msync's --git-base command line option.
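For the sed variant of the substitution step, here is a minimal sketch. The token and host values are placeholders (in GitLab CI, CI_SERVER_HOST is predefined and ACCESS_TOKEN would be a masked CI/CD variable), and the file lives in a temporary directory so the sketch is self-contained. It renders into a temporary file first, because reading and writing the same file within one pipeline can truncate it before it has been read.

```shell
# Placeholder values for illustration only.
ACCESS_TOKEN="example-token"
CI_SERVER_HOST="gitlab.example.com"

# Throwaway copy of the config with unexpanded placeholders.
workdir="$(mktemp -d)"
cat > "$workdir/modulesync.yml" <<'EOF'
---
git_base: https://oauth2:${ACCESS_TOKEN}@${CI_SERVER_HOST}/
branch: managed-update
EOF

# Replace the literal ${...} placeholders. [$] matches a literal dollar sign.
# Note: a token containing "|" or "&" would need escaping for sed.
sed -e 's|[$]{ACCESS_TOKEN}|'"$ACCESS_TOKEN"'|g' \
    -e 's|[$]{CI_SERVER_HOST}|'"$CI_SERVER_HOST"'|g' \
    "$workdir/modulesync.yml" > "$workdir/modulesync.yml.tmp"
mv "$workdir/modulesync.yml.tmp" "$workdir/modulesync.yml"

rendered_git_base="$(grep '^git_base:' "$workdir/modulesync.yml")"
echo "$rendered_git_base"
```

In a real pipeline you would probably not echo the rendered line, since it contains the token.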

Background reading:

GitLab: Authenticate Using Access Token (blog)
GitLab: Predefined variables (official docs)

ekohl commented 3 years ago

Another trick to remember is that Git can transparently replace URLs for you. For example, to replace git:// with https://:

$ git config url."https://".insteadOf git://
$ grep url .git/config 
    url = git://github.com/voxpupuli/modulesync
$ git remote -v
origin  https://github.com/voxpupuli/modulesync (fetch)
origin  https://github.com/voxpupuli/modulesync (push)

So I have some aliases in my ~/.gitconfig:

[url "https://github.com/"]
        insteadOf = "gh:"
[url "git@github.com:"]
        pushInsteadOf = "gh:"
[url "https://github.com/"]
        insteadOf = "git://github.com/"
[url "git@github.com:"]
        pushInsteadOf = "https://github.com/"

This also means I can do this:

$ git clone gh:voxpupuli/modulesync
Cloning into 'modulesync'...
remote: Enumerating objects: 55, done.
remote: Counting objects: 100% (55/55), done.
remote: Compressing objects: 100% (37/37), done.
remote: Total 1764 (delta 25), reused 38 (delta 17), pack-reused 1709
Receiving objects: 100% (1764/1764), 389.47 KiB | 5.49 MiB/s, done.
Resolving deltas: 100% (928/928), done.
$ cd modulesync/
$ git remote -v
origin  https://github.com/voxpupuli/modulesync (fetch)
origin  git@github.com:voxpupuli/modulesync (push)
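The same mechanism could map SSH-style remotes onto HTTPS for a CI job that cannot use SSH. A minimal sketch, assuming a hypothetical gitlab.example.com host and a throwaway repository so that no global configuration is touched. git ls-remote --get-url only expands the URL and does not contact the server:

```shell
# Throwaway repository so the rewrite rule stays local to this sketch.
repo="$(mktemp -d)"
git init -q "$repo"

# Rewrite SSH-style URLs for this host to HTTPS (the host name is an assumption).
git -C "$repo" config url."https://gitlab.example.com/".insteadOf "git@gitlab.example.com:"

# Ask Git which URL it would actually use for an SSH-style address.
rewritten="$(git -C "$repo" ls-remote --get-url "git@gitlab.example.com:some/project.git")"
echo "$rewritten"
```

In a pipeline the rule would more likely be set with git config --global before msync runs, so that the clones it performs pick it up.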

As for the username/password, git credentials is probably a better solution.
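As a rough sketch of the credentials route, Git accepts a shell snippet as a credential helper, which can answer from an environment variable so that the token never ends up in a URL or a config file. The oauth2 user name follows the GitLab example earlier in the thread. ACCESS_TOKEN and the host are placeholders. The helper is passed with -c here only to keep the sketch from writing any configuration.

```shell
# Placeholder token; in CI this would come from a masked variable.
export ACCESS_TOKEN="example-token"
export GIT_TERMINAL_PROMPT=0   # never fall back to an interactive prompt

# Inline helper: prints the user name and the token taken from the environment.
helper='!f() { echo username=oauth2; echo "password=$ACCESS_TOKEN"; }; f'

# To persist it for one host instead:
#   git config --global credential.https://gitlab.example.com.helper "$helper"

# The empty credential.helper= resets any helpers configured elsewhere.
filled="$(printf 'protocol=https\nhost=gitlab.example.com\n\n' |
  git -c credential.helper= -c credential.helper="$helper" credential fill)"

# Show only the user name line, to keep the token out of the output.
printf '%s\n' "$filled" | grep '^username='
```

With the helper configured, a plain git_base such as https://gitlab.example.com/ should be enough, since Git asks the helper whenever the server requests authentication.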

I think this isn't documented in this project since it's really built into git already.

bittner commented 3 years ago

Thanks for the hints, @ekohl. That's certainly nice to know. But this isn't really just about knowing Git.

My point is that ModuleSync currently neither gives a helpful hint when Git fails to authenticate (see above) nor offers a safe and convenient way to inject credentials when you need to use Git over HTTP (note the example above, which shows that we have to manipulate a configuration file). Even worse, the README doesn't mention this case at all (likely because ModuleSync is rarely used with Git over HTTP and private repos).

Ideally, msync should:

offer a --base-url CLI option to allow passing in the base_url value dynamically,
handle the specific error situation gracefully, suggesting to use dedicated CLI options (e.g. --username, --password) or --base-url with username and password in the URL, or use one of the Git-based solutions you suggested above.

ekohl commented 3 years ago

offer a --base-url CLI option to allow passing in the base_url value dynamically,

It does but it's called --git-base. It's used here: https://github.com/voxpupuli/modulesync_config/blob/e044c5f6fe5cdb632f43601067e31b816f4b3edc/.github/workflows/main.yml#L19

handle the specific error situation gracefully, suggesting to use dedicated CLI options (e.g. --username, --password) or --base-url with username and password in the URL, or use one of the Git-based solutions you suggested above.

I don't think it should. If you want to, specify it in the URL or use Git-based solutions. It's the Unix philosophy: small tools that each do their own thing. IMHO modulesync should do less with Git, not more. Personally, I use modulesync most often with update --offline and perform Git actions in a shell loop.
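Such a loop might look roughly like the sketch below. It assumes the modules/<namespace>/<name> checkout layout seen in the clone path of the error output above, and it fakes one checkout in a temporary directory instead of actually running msync update --offline. The commit and push commands are only hinted at in a comment.

```shell
# Stand-in for the tree that already holds the module checkouts.
workdir="$(mktemp -d)"
git init -q "$workdir/modules/some/project"
touch "$workdir/modules/some/project/managed-file"   # simulates a rendered file

# In real use, this is where `msync update --offline` would run.

changed=""
for repo in "$workdir"/modules/*/*/; do
  # --porcelain prints nothing when the working tree is clean.
  if [ -n "$(git -C "$repo" status --porcelain)" ]; then
    changed="$changed ${repo#"$workdir"/}"
    # e.g. git -C "$repo" add -A, then commit and push with your own credentials
  fi
done
echo "Repositories with pending changes:$changed"
```

Authentication then becomes purely a matter of how Git itself is configured in that shell.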

bittner commented 3 years ago

It does but it's called --git-base.

Oh yeah, that's true. My mistake! (I corrected the wrong references in the comments above). That works, nice!

What I found missing is an environment variable as an alternative to --git-base. Maybe we can add that similar to what we do for PRs.
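Until something like that exists, a small wrapper could emulate it. In this sketch MSYNC_GIT_BASE and MSYNC_BIN are hypothetical names that msync itself does not read. The wrapper only appends --git-base when the variable is set, and MSYNC_BIN=echo turns the call into a dry run so the sketch works without msync installed.

```shell
# Hypothetical wrapper: forwards MSYNC_GIT_BASE as --git-base if it is set.
msync_with_env() {
  if [ -n "${MSYNC_GIT_BASE:-}" ]; then
    set -- "$@" --git-base "$MSYNC_GIT_BASE"
  fi
  # MSYNC_BIN defaults to the real msync; set it to echo for a dry run.
  ${MSYNC_BIN:-msync} "$@"
}

# Placeholder values for illustration only.
MSYNC_GIT_BASE="https://oauth2:example-token@gitlab.example.com/"
MSYNC_BIN=echo
dry_run="$(msync_with_env update --pr)"
echo "$dry_run"
```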

Document how to use with Git over HTTP #210
