gitkraken / vscode-gitlens

Supercharge Git inside VS Code and unlock untapped knowledge within each repository — Visualize code authorship at a glance via Git blame annotations and CodeLens, seamlessly navigate and explore Git repositories, gain valuable insights via rich visualizations and powerful comparison commands, and so much more
http://gitkraken.com/gitlens
Other
8.98k stars 1.33k forks source link

Support Git remote providers running behind a reverse proxy #1964

Open adambadura opened 2 years ago

adambadura commented 2 years ago

In our use case, we have Gerrit set up with the main (push) server being on gerrit.ext.net.nokia.com, however, Gitiles are on https://gerrit.ext.net.nokia.com/gerrit/ rather than https://gerrit.ext.net.nokia.com/. It seems there is no way to configure this. While without proper configuration for example links from Change-Id: in the commit message don't work since they use https://gerrit.ext.net.nokia.com/q/<change-id> instead of https://gerrit.ext.net.nokia.com/gerrit/q/<change-id>.

For a more detailed investigation and analysis see #1954 and the comment by @felipecrs in particular.

felipecrs commented 2 years ago

Just some small clarifications:

As it can be seemed, it's not possible to infer the /gerrit prefix when someone has cloned the project using SSH.

My suggested solution is to have a new option in the remote provider configuration, like:

{
  "gitlens.remotes": [
    {"domain": "gerrit.ext.net.nokia.com", "type": "Gerrit", "pathPrefix": "gerrit/"},
  ],
}

Which, if set, will be used to build the final URLs to be used by Gitlens.

I believe this could be applied to most providers, and the reason why this was never brought up before, it's because this is not a so common setup.

So, before me or anyone jump to the implementation, it would be nice if we could agree on the proposed design and if it should be extended for the other providers besides Gerrit.

adambadura commented 2 years ago

Why only a "pathPrefix"? Why not provide possibility to specify domain/URL for each individual type of operation? Someone might come up with a setup where push is done to the main server but both fetch and query are done on secondary servers.

felipecrs commented 2 years ago

but both fetch and query are done on secondary servers.

I don't think this is realistic... as I mentioned at some point, in my company we also have such a master/mirror setup. We should fetch from mirror and push to master, but any HTTPS operation (like querying) should be done against master.

adambadura commented 2 years ago

Still, I consider the benefit of providing only the pathPrefix to be smaller than the flexibility of a full URL. However, if the "framework" allows it perhaps a more flexible configuration could be made. Like:

{"domain": "gerrit.ext.net.nokia.com", "type": "Gerrit", "query": {"pathPrefix": "gerrit/"}}

or

{"domain": "gerrit.ext.net.nokia.com", "type": "Gerrit", "query": {"url": "why.not.gerrit.ext.net.nokia.com/gerrit/"}}
felipecrs commented 2 years ago

I always prefer to give users less things to tinker with rather than providing more options unless the options are really needed. Which hasn't been the case so far.

But anyway, that's only my opinion.

felipecrs commented 2 years ago

By the way, if you enter in the clone page of any project, could you please confirm that the URL matches the following format when using HTTPS protocol?

git clone https://<user>@gerrit.ext.net.nokia.com/gerrit/a/<project>
adambadura commented 2 years ago

I have the following options:

So, I confirm.

felipecrs commented 2 years ago

The importance of this is that Gitlens will read this information to detect which Remote Provider type to use. In your case, the only missing thing is the /gerrit prefix, because you cloned over SSH.

And even for the HTTPS clone URLs, it would be impossible for Gitlens to distinguish whether /gerrit is part of the project name or not.

That's why prefix must be somehow informed by the user in the settings. Anyway, this does not change anything. I just can't really think of a use case for being more granular than that.

eamodio commented 2 years ago

Without diving into it too deeply (which I can't right now), I'm a bit hesitant to add another remotes configuration option, unless there was a good use-case that this would reduce complexity/configuration for users across more remotes (not just Gerrit).

At the same time GitLens today allows for the configuration of custom remotes where you can provide the specific urls required for each use-case via the urls property of the gitlens.remotes setting. While verbose, does that not provide the flexibility required here?

felipecrs commented 2 years ago

At the same time GitLens today allows for the configuration of custom remotes where you can provide the specific urls required for each use-case via the urls property of the gitlens.remotes setting. While verbose, does that not provide the flexibility required here?

That would be the first thing to think, indeed. Taking the simplification and verbosity of configuration apart, the drawback of using Custom rather than Gerrit is that it would not render clickable-links to the Gerrit changes from the commit message Change-Id: footer.

Without diving into it too deeply (which I can't right now), I'm a bit hesitant to add another remotes configuration option, unless there was a good use-case that this would reduce complexity/configuration for users across more remotes (not just Gerrit).

It seems that it's also possible to run GitLab through a reverse proxy, turning down to a configuration like the one which @adambadura have.

https://serengetitech.com/tech/proxy-in-reverse-nginx-gitlab-jira-jenkins/

One could setup GitLab's HTTP interface at:

https://my-domain.com/gitlab/

While SSH operations to fetch/push could be done against:

ssh://my-domain.com:2323/<repository>

Where 2323 is GitLab's SSH port.

In this situation, the prefix option would be required for Gitlens to correctly compute the right Gitlab URLs. One would set the Gitlens remote as:

{
  "gitlens.remotes": [
    {"domain": "my-domain.com", "type": "Gitlab", "pathPrefix": "gitlab/"},
  ],
}
felipecrs commented 2 years ago

Okay, I think we got a perfect and tangible example:

The RDO Project: https://review.rdoproject.org/r/

When cloning over SSH:

image

$ git remote get-url origin
ssh://felipecrs@review.rdoproject.org:29418/centos-opstools/kibana

We don't have the prefix. But the Gerrit base URL has the /r prefix. Example:

https://review.rdoproject.org/r/plugins/gitiles/centos-opstools/kibana

Therefore, it would also make use of the pathPrefix:

{
  "gitlens.remotes": [
    {"domain": "review.rdoproject.org", "type": "Gerrit", "pathPrefix": "r/"},
  ],
}

The RDO Gerrit was bootstrapped using Software Factory, which is a tool to simplify the deployment of many developer-related tools, like Gerrit, Zuul CI, Kibana and others under a single domain, working with reverse proxies.

And there are countless Gerrits over there deployed with Software Factory, especially counting the ones not publicly available.

eamodio commented 2 years ago

Damn, I was thinking I allowed the urls to override the pre-configured ones of a specific provider (so you wouldn't have to use "Custom") and lose the built-in autolinking. Maybe that would be a reasonable path, to allow for override support.

felipecrs commented 2 years ago

So, urls would need to accept different attributes depending on the provider (because none of the existing ones for custom helps in Gerrit). Example:

Custom provider

{
  "gitlens.remotes": [
    {
      "domain": "git.corporate-url.com",
      "type": "Custom",
      "name": "My Company",
      "protocol": "https",
      "urls": {
        "repository": "https://git.corporate-url.com/${repo}",
        "branches": "https://git.corporate-url.com/${repo}/branches",
        "branch": "https://git.corporate-url.com/${repo}/commits/${branch}",
        "commit": "https://git.corporate-url.com/${repo}/commit/${id}",
        "file": "https://git.corporate-url.com/${repo}?path=${file}${line}",
        "fileInBranch": "https://git.corporate-url.com/${repo}/blob/${branch}/${file}${line}",
        "fileInCommit": "https://git.corporate-url.com/${repo}/blob/${id}/${file}${line}",
        "fileLine": "#L${line}",
        "fileRange": "#L${start}-L${end}"
      }
    }
  ]
}

No changes, as today.

Gerrit provider

{
  "gitlens.remotes": [
    {
      "domain": "review.corporate-url.com",
      "type": "Gerrit",
      "urls": {
        "base": "https://review.corporate-url.com/gerrit"
        // PS: I haven't found a need for more than this yet
      }
    }
  ]
}

WDYT?

eamodio commented 2 years ago

yeah, it has definitely been requested to have a separate base url (one that could also specify a port) -- so that would squash some other bugs/feature requests.

This new request https://github.com/gitkraken/vscode-gitlens/issues/1943 is basically asking for this too. /cc @oliverdunk

felipecrs commented 2 years ago

@oliverdunk, just out of curiosity, are you using Gerrit as well or something else?

oliverdunk commented 2 years ago

@oliverdunk, just out of curiosity, are you using Gerrit as well or something else?

Hey @felipecrs / @eamodio 👋 I'm using a private GitLab instance - both the Web UI and Git remote are GitLab based but the different components of GitLab are split between different domains.

felipecrs commented 2 years ago

I suppose this suggestion:

{
  "gitlens.remotes": [
    {
      "domain": "my.corporate-url.com",
      "type": "Gitlab",
      "urls": {
        "base": "https://my.corporate-url.com/gitlab"
      }
    }
  ]
}

Would do the trick for you then.

oliverdunk commented 2 years ago

I suppose this suggestion:

{
  "gitlens.remotes": [
    {
      "domain": "my.corporate-url.com",
      "type": "Gitlab",
      "urls": {
        "base": "https://my.corporate-url.com/gitlab"
      }
    }
  ]
}

Would do the trick for you then.

Yep, that would work perfectly. It almost feels like it should be a top level key because it controls all of the URLs, rather than just being an individual one. But I could be convinced either way.

As I mentioned in #1943 I'd be happy to open a PR for this if it helps make it happen. But if you have plans here then don't feel a need to save it for me, happy for you to take it too. I'd just love to see this ship so I can commit a config to the repo and stop opening URLs with GitLens and then manually tweaking the URL in my browser's address bar.

Sounds like @adambadura would like it too 😀

felipecrs commented 2 years ago

Just to update, one major complication is that the protocol (https/ssh) from the git remote URL is not currently exposed to the Remote Provider constructor. That needs to be changed for the url.base to work well, because:

Imagine you have a local repo like this:

$ git remote get-url origin
https://my.corporate-url.com/gitlab/my/project

Or another, like this:

$ git remote get-url origin
ssh://my.corporate-url.com/my/project

In which both points to the same remote project.

And a configuration like this:

{
  "gitlens.remotes": [
    {
      "domain": "my.corporate-url.com",
      "type": "Gitlab",
      "urls": {
        "base": "https://my.corporate-url.com/gitlab"
      }
    }
  ]
}

Gitlens could easily take the path from the SSH URL and append that to url.base. But from the HTTPS URL, a simple append would result in something like:

https://my.corporate-url.com/gitlab/gitlab/my/project

Which is ofc wrong. Therefore, the Remote Provider classes needs to be aware of the git remote protocol, so it can properly subtract the path prefix from the url.base from the git remote URL in case the protocols match.

felipecrs commented 2 years ago

The implementation of it will take me a while. I have started, but I have no forecast to finish yet.

@oliverdunk feel free to start implementing it as well, if you are willing to. I just suggest you have this acknowledged https://github.com/gitkraken/vscode-gitlens/issues/1964#issuecomment-1100339957, because it's an easy thing to forget.

@eamodio, in case you were waiting for this before releasing a new version of Gitlens, I suggest you don't. That's because the support for vanilla Gerrit introduced earlier already satisfices several (reported) use cases.

oliverdunk commented 2 years ago

I started poking here and getting a feel for what would need to change/how hard this is.

Being able to open URLs on a totally different domain to the remote (#1943) is fairly easy. We update RemoteProvider to take the array of URLs from the config and then update the getter for baseUrl to use that if it's available. Sorted!

For the other side of the request...

Looking at how protocols work, it actually seems like the fact that the "Open branch, open commit etc." features open a HTTPS URL regardless of the remote might actually be an "accident". From what I can tell, RemoteProviderFactory never passes a protocol to the RemoteProvider constructor. Consequently, the provider defaults to HTTPS, and all of the functions that depend on baseUrl work.

With that in mind, I'm unsure what the intended way of tackling multiple protocols in GitLens internals is. If we start passing in the right protocol that would break baseURL and we would need to get the default base another way.

Regardless of the fix there, we would still need to implement a solution for https://github.com/gitkraken/vscode-gitlens/issues/1964#issuecomment-1100339957. The proposed solution of saying "if the protocols match, try to remove the path of one URL from the other" would probably work but feels a little hacky. It also doesn't support the SSH url being on /path_a/group_name/repo_name and the Git URL being /path_b/group_name/repo_name.

A better solution may be having some sort of pathMatcher, where you can pass a RegEx and select the bit of the URL that represents the repo name?

Any thoughts @eamodio? I feel bad tackling just my request but addressing this issue too definitely adds some complexity. If you have thoughts on a way we can do it nicely I'm happy to try and take on both - otherwise maybe we tackle these separately so one doesn't block the other?

felipecrs commented 2 years ago

It also doesn't support the SSH url being on /path_a/group_name/repo_name and the Git URL being /path_b/group_name/repo_name.

I believe this isn't possible to happen. I would claim that SSH urls never have a path prefix, while an HTTP url could have.

cloph commented 4 weeks ago

my +1 for a way to use a different base/reviewURL for gerrit - we have generic/canonical git.example.com/repo clone URLs (and also mirrors on other servies), but primary gerrit is at gerrit.example.com - similarly the ssh-URL also wouldn't reliably work since frequently ssh-aliases are used that then also don't match the webdomain.