renovatebot / renovate

Home of the Renovate CLI: Cross-platform Dependency Automation by Mend.io
https://mend.io/renovate
GNU Affero General Public License v3.0

Support git insteadOf env configuration in git-refs and git-tags datasources #19271

Open Okeanos opened 1 year ago

Okeanos commented 1 year ago

What would you like Renovate to be able to do?

As per the discussion here concerning "Golang Support for private pseudo host":

Renovate could improve its Go module (gomod/go.mod) support by implementing a strategy to deal with "fake" or "pseudo" names for modules.

Go module names have certain restrictions, e.g. they may not contain certain characters such as the % symbol. However, under the hood Go modules are effectively URLs to Git repositories and resolved as such. URLs generally do not have this problem and the URL https://okeanos-azure@dev.azure.com/okeanos-azure/renovate%20me/_git/renovate%20me/go-example.git is a perfectly valid URL that Git can resolve and use. However, because of the URL-encoded whitespace Go will reject it.

The workaround to this is to use an insteadOf rule within the applicable .gitconfig like so:

[url "https://okeanos-azure@dev.azure.com/okeanos-azure/renovate%20me/_git/"]
        insteadOf = https://go.nikolasgrottendieck.com/

And then referring to the go-example module like this in a go.mod file:

require go.nikolasgrottendieck.com/go-example.git v1.0.2

Optionally, the following Go environment settings can be supplied to make the process a little more enjoyable (fewer attempts from Go to do stuff that doesn't work anyway):

go env -w GONOPROXY='*.nikolasgrottendieck.com'
go env -w GONOSUMDB='*.nikolasgrottendieck.com'
go env -w GOPRIVATE='*.nikolasgrottendieck.com'

Afterwards, Go will happily pretend go.nikolasgrottendieck.com/<something> is a valid module path and let Git resolve it correctly behind the scenes. This is effectively the same way Go says to pass credentials for private repositories (something Renovate already supports); in this case, however, the target and source for the insteadOf are not predictable. Microsoft says pretty much the same in their documentation as well. Based on the discussion I mentioned earlier, though, there does not appear to be a way to tell Renovate about this non-predictable rewrite.
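To make the rewrite concrete, here is a minimal sketch of how Git's url.<base>.insteadOf prefix replacement behaves, assuming the rule shown earlier. It is an illustrative model only, not code from Git or Renovate; the names InsteadOfRule and rewriteUrl are hypothetical. When several insteadOf prefixes match, Git uses the longest one, which the sketch mirrors.

```typescript
// Illustrative model of Git's url.<base>.insteadOf rewriting.
// Hypothetical names; this is not Git or Renovate code.
interface InsteadOfRule {
  base: string; // real URL prefix, the <base> in url.<base>.insteadOf
  insteadOf: string; // prefix that gets replaced by <base>
}

function rewriteUrl(url: string, rules: InsteadOfRule[]): string {
  let best: InsteadOfRule | undefined;
  for (const rule of rules) {
    const matches = url.startsWith(rule.insteadOf);
    // Prefer the longest matching insteadOf prefix, as Git does.
    if (matches && (best === undefined || rule.insteadOf.length > best.insteadOf.length)) {
      best = rule;
    }
  }
  if (best === undefined) {
    return url; // no rule applies, URL is used unchanged
  }
  return best.base + url.slice(best.insteadOf.length);
}

// The rule from the .gitconfig snippet in this issue.
const rules: InsteadOfRule[] = [
  {
    base: "https://okeanos-azure@dev.azure.com/okeanos-azure/renovate%20me/_git/",
    insteadOf: "https://go.nikolasgrottendieck.com/",
  },
];

const rewritten = rewriteUrl("https://go.nikolasgrottendieck.com/go-example.git", rules);
console.log(rewritten);
```

The same prefix logic would cover an scp-like source such as git@git.example.com:, since Git treats the insteadOf value as a plain string prefix.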

A public example repository with the Go module setup described above exists here and also contains a number of workflow files that I used to explore various ways of telling Renovate (and Go) about this rewrite. Sadly, I do not have the time to debug where exactly the different workarounds I attempted fail – however, sometimes go get is invoked and sometimes Renovate fails before that. Feel free to look at the workflow runs. In case you need to reproduce something yourself, all code involved here is public and freely available for you to fork and work with as you please. If you need clarification for some parts, please let me know!

It is possible to tell Git about this rewrite in different ways so that Go can pick it up (so to speak):

Potentially related discussions and issues:

If you have any ideas on how this should be implemented, please tell us here.

As far as I can currently tell two things need to happen for this to work:

Is this a feature you are interested in implementing yourself?

No

rarkins commented 1 year ago

There is only one part to be done for this issue: the git-refs and git-tags datasources should support git insteadOf when configured in env (not file).

If further steps are required, they should be separate, follow-up issues.
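For context, a minimal sketch of what "insteadOf configured in env" could look like, assuming it refers to Git's GIT_CONFIG_COUNT / GIT_CONFIG_KEY_<n> / GIT_CONFIG_VALUE_<n> environment variables (available since Git 2.31). The helper names insteadOfEnv and insteadOfRulesFromEnv are hypothetical and are not Renovate functions; the second one shows how a datasource might read such rules back before resolving a URL.

```typescript
// Hypothetical helpers; a sketch only, not Renovate's implementation.
interface EnvInsteadOfRule {
  base: string; // real URL prefix
  insteadOf: string; // prefix to be replaced
}

// Express insteadOf rules as Git's env-based config (indices start at 0).
function insteadOfEnv(rules: EnvInsteadOfRule[]): Record<string, string> {
  const env: Record<string, string> = { GIT_CONFIG_COUNT: String(rules.length) };
  rules.forEach((rule, i) => {
    env[`GIT_CONFIG_KEY_${i}`] = `url.${rule.base}.insteadOf`;
    env[`GIT_CONFIG_VALUE_${i}`] = rule.insteadOf;
  });
  return env;
}

// Read insteadOf rules back out of such an environment.
function insteadOfRulesFromEnv(env: Record<string, string | undefined>): EnvInsteadOfRule[] {
  const count = Number.parseInt(env["GIT_CONFIG_COUNT"] ?? "0", 10);
  const found: EnvInsteadOfRule[] = [];
  for (let i = 0; i < (Number.isNaN(count) ? 0 : count); i++) {
    const key = env[`GIT_CONFIG_KEY_${i}`];
    const value = env[`GIT_CONFIG_VALUE_${i}`];
    if (key === undefined || value === undefined) {
      continue;
    }
    // Section and variable names are case-insensitive; the URL part is kept as-is.
    const match = /^url\.(.+)\.insteadof$/i.exec(key);
    if (match !== null && match[1] !== undefined) {
      found.push({ base: match[1], insteadOf: value });
    }
  }
  return found;
}

const gitEnv = insteadOfEnv([
  {
    base: "https://okeanos-azure@dev.azure.com/okeanos-azure/renovate%20me/_git/",
    insteadOf: "https://go.nikolasgrottendieck.com/",
  },
]);
const parsedRules = insteadOfRulesFromEnv(gitEnv);
console.log(gitEnv, parsedRules);
```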

Okeanos commented 1 year ago

I recently had some time and went ahead and tried to understand the code as it currently exists, especially in relation to how it handles Git environment variables.

So, from my current understanding the following happens:

  1. Go Module support is loaded via the gomod manager declaration.
  2. The GoDatasource is declared as one of two options to resolve dependencies for said manager (the other, for the Go language itself, is not important here).
  3. GoDatasource declares two datasources of its own: GoDirectDatasource and GoProxyDatasource – the proxy datasource should be irrelevant for this case, I think.
  4. Besides some other (irrelevant for this use case) datasources, the BaseGoDatasource is declared as part of the GoDirectDatasource.
  5. BaseGoDatasource.getDatasource is invoked which will call BaseGoDatasource.goGetDatasource as a fallback because no other rule applies.
  6. BaseGoDatasource.goGetDatasource will immediately try to do an http.get on the goModule, assuming that it is indeed the correct URL. This is the first point where the rewrite is needed.
  7. Some time later …
  8. updateArtifacts from artifacts.ts is invoked and will write some things to the extraEnv, e.g. the authentication settings for Git (getGitEnvironmentVariables), to be used with go get in the end. This is the second point where the rewrite is needed.

Interestingly (obviously?), the http.get call in BaseGoDatasource.goGetDatasource as well as the getGitEnvironmentVariables invocation within the Go specific code use the hostRules declaration to inject authentication into the processes.

I am now wondering whether the actual fix for this issue is to add an additional hostRules property, e.g. replaceWith, that can be used to replace the matchHost value with another value, similar to the Git-specific insteadOf syntax:

{
  "hostRules": [
    {
      "matchHost": "go.nikolasgrottendieck.com",
      "username": "<some-username>",
      "password": "<some-password>",
      "replaceWith": "dev.azure.com/okeanos-azure/renovate%20me/_git/renovate%20me"
    }
  ]
}

When parsing the hostRules all matches for go.nikolasgrottendieck.com would now be rewritten to call dev.azure.com/okeanos-azure/renovate%20me/_git/renovate%20me instead and use the other specified configuration values, e.g. the authentication, as well.

This would be contingent on being able to replace the matching host with more than a different host, though, as shown in my example.
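A rough sketch of how such a rule might be applied, purely to illustrate the proposal. The replaceWith property does not exist in Renovate; HostRuleSketch and applyReplaceWith are hypothetical names, and the matching is deliberately simplified to an https:// host prefix rather than Renovate's actual matchHost semantics.

```typescript
// Hypothetical: "replaceWith" is the property proposed in this comment,
// not an existing Renovate option. Matching is simplified on purpose.
interface HostRuleSketch {
  matchHost: string;
  username?: string;
  password?: string;
  replaceWith?: string; // host plus optional path to substitute for matchHost
}

function applyReplaceWith(moduleUrl: string, hostRules: HostRuleSketch[]): string {
  for (const rule of hostRules) {
    if (rule.replaceWith === undefined) {
      continue;
    }
    const prefix = `https://${rule.matchHost}`;
    // Only match on a full host boundary, not on a longer host sharing the prefix.
    if (moduleUrl === prefix || moduleUrl.startsWith(`${prefix}/`)) {
      return `https://${rule.replaceWith}${moduleUrl.slice(prefix.length)}`;
    }
  }
  return moduleUrl;
}

const sketchRules: HostRuleSketch[] = [
  {
    matchHost: "go.nikolasgrottendieck.com",
    username: "<some-username>",
    password: "<some-password>",
    replaceWith: "dev.azure.com/okeanos-azure/renovate%20me/_git/renovate%20me",
  },
];

const resolved = applyReplaceWith("https://go.nikolasgrottendieck.com/go-example.git", sketchRules);
console.log(resolved);
```

Because replaceWith here carries a path as well as a host, the replacement is a string prefix swap rather than a host swap, which is the contingency mentioned above.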

rarkins commented 1 year ago

I had also thought about using a hostRules approach similar to the one you described. My main concern was whether we could support the same complexity that Git's syntax does.

szpak commented 1 year ago

I recently ran into this in an ArgoCD project. The dependencies on Helm charts are detected nicely, but resolving them fails with Host key verification failed. There is an SSH private key; however, using the access token (here GitLab) would make it possible to resolve them (the Renovate user has read-only access to them). Unfortunately, it is somewhat problematic to change the syntax from git@ to https://, as ArgoCD internally uses SSH access to those charts.

git config --global url."https://gitlab-ci-token:${CI_JOB_TOKEN}@git.example.com/".insteadOf "git@git.example.com:"

Using insteadOf eliminated the error, but Renovate doesn't try to find available versions at all (I'm not sure why?). I believe this feature request would also help in my case. However, @rarkins, you mentioned:

the git-refs and git-tags datasources should support git insteadOf when configured in env (not file).

Why is "not file" added at the end? Should it work for git insteadOf configured in the Git config? And why would there be no attempt to resolve the dependencies in the DEBUG logs after the change?

szpak commented 1 year ago

Using insteadOf eliminated the error, but Renovate doesn't try to find available versions at all (I'm not sure why?)

After some time reading the code of Renovate and git-js, I realized that insteadOf is properly applied for ls-remote in git-tags. The problem in my case was:

In the end, tags from private repositories on GitLab are picked up for Helm charts in ArgoCD. Nevertheless, this issue may still be valid for the original problem with Go.