Open robbiemcmichael opened 12 months ago
This issue is currently awaiting triage.
SIG CLI takes a lead on issue triage for this repo, but any Kubernetes member can accept issues by applying the triage/accepted
label.
The triage/accepted
label can be added by org members by writing /triage accepted
in a comment.
There's a long debate in https://github.com/kubernetes-sigs/kustomize/pull/4334#pullrequestreview-829283639 which lends some support to the code being overly complicated in that section. I think that the lack of a way to tell the difference between the two protocols for remote resources is one of the things that makes this code overly complicated.
A better example of the code being pretty insane is in this section:
If a git repo is private, then GitHub will return a 404. Since Kustomize tries to optimistically fetch via HTTP first, you would think that the 404 would cause it to return an error even before it gets to the part where it does a git fetch
, but you would be wrong.
The test for whether the error is a load.ErrHTTP
error relies on detecting the wrapped error at the end of this block:
That part is fine, but right before the error is wrapped there's a test for whether the URL could be a git repo. If it is a git repo, it returns a separate error rather than wrapping the load.ErrHTTP
error which drops the original HTTP error from the return. This bug "luckily" results in the correct behaviour of ignoring the 404 for a failed HTTP fetch of a private repo and then it goes on to try a git fetch
instead.
But look at the code that checks whether the URL could be a git repo:
Pretty much any URL will satisfy this, for example https://example.com/does/not/exist
is considered to be a valid git repo when we have no idea what might be there. This means that the HTTP error code is effectively being ignored any time a URL is used.
The code in this area took a long time to understand and seems like it might be riddled with subtle bugs. I'm not sure of the best way to fix all of it. Being able to explicitly say that a remote resource should be fetched with git but over HTTP would be a great start, but it's not going to allow us to fix all of these problems because we'll need to maintain backwards compatibility for existing kustomization files that don't use the (currently deprecated) git::
prefix.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
lifecycle/stale
is appliedlifecycle/stale
was applied, lifecycle/rotten
is appliedlifecycle/rotten
was applied, the issue is closedYou can:
/remove-lifecycle stale
/close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
lifecycle/stale
is appliedlifecycle/stale
was applied, lifecycle/rotten
is appliedlifecycle/rotten
was applied, the issue is closedYou can:
/remove-lifecycle stale
/close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
lifecycle/stale
is appliedlifecycle/stale
was applied, lifecycle/rotten
is appliedlifecycle/rotten
was applied, the issue is closedYou can:
/remove-lifecycle stale
/close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
lifecycle/stale
is appliedlifecycle/stale
was applied, lifecycle/rotten
is appliedlifecycle/rotten
was applied, the issue is closedYou can:
/remove-lifecycle rotten
/close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/remove-lifecycle rotten
Eschewed features
What would you like to have added?
The
git::
prefix was deprecated in #4954 because Kustomize had been ignoring it anyway, a sensible decision to make at the time. The problem is that there is no easy way to tell the difference between a remote resource that should be fetched straight via HTTPGET
from one that should be fetched using git but using HTTP as a transport. Instead we should consider adding thegit::
prefix back and encouraging its use to disambiguate these two types of remote resources.Why is this needed?
The code currently has to attempt to fetch via HTTP and when that fails it instead tries to fetch via git over HTTP. This results in poor error messages that are pretty tricky to fix, refer to #5382 for further details.
The complexity of the code here is likely a contributing factor behind some odd bugs like #4559 which resulted in a code freeze being announced in #4756 for this section of the code.
Can you accomplish the motivating task without this feature, and if so, how?
I'm struggling to find a sensible way to fix #5382 since there's no easy way to tell the difference between the two types of remote resources. If we could tell what the user intended, then we could return the correct error message (either an HTTP error or a git error).
What other solutions have you considered?
Since we can't easily infer whether it's a regular HTTP fetch or a git fetch over HTTP, we could keep the existing approach where we just try each of them and see which works. I think it would be more reliable to try git first because if the URL
https://github.com/kubernetes-sigs/kustomize//examples/multibases/base?ref=v3.3.1
allows us to fetch content from git then that was probably what was intended.Trying HTTP first for the URL returns a 404 in the case of this particular GitHub repo, but it's not guaranteed to do so. For GitHub Enterprise with anonymous auth disabled, a URL for a repo would return a 200 and the contents of the login page. Trying git first is likely a better choice but it's hard to predict the consequences of reordering this.
Anything else we should know?
Terraform uses a similar pattern for fetching remote modules. They likely ran into similar problems and I assume that this is where the convention of
git::
comes from:https://developer.hashicorp.com/terraform/language/modules/sources#generic-git-repository
Feature ownership