Need proxy support in air-gapped environment

hanlins commented 3 years ago

User Story

As an operator, I would like to add proxy configuration to CAPI providers running in air-gapped environments.

Detailed Description

In an air-gapped environment, Cluster API provider pods cannot talk to the infrastructure provider directly. In this scenario, a proxy server is typically deployed to enable connectivity and to audit the traffic that passes through the firewall. It would be ideal to have a mechanism to plumb the proxy server configuration into the Cluster API provider pods so that they can communicate with the infrastructure.

Anything else you would like to add: One approach I can think of is to have something like this:

HTTP_PROXY=xxx clusterctl init

The implementation should be similar to https://github.com/kubernetes/kubernetes/pull/84559.
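
As a rough illustration of the idea of forwarding the caller's proxy variables, the sketch below collects `HTTP_PROXY`, `HTTPS_PROXY`, and `NO_PROXY` from an environment mapping and turns them into container `env` entries. The function name `proxy_env_entries` and the lower-case fallback are assumptions made for this example; this is not code from clusterctl or from the referenced PR.

```python
from typing import Dict, List, Mapping

# Proxy variables conventionally honored by HTTP clients.
PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY")


def proxy_env_entries(environ: Mapping[str, str]) -> List[Dict[str, str]]:
    """Return container env entries for the proxy variables set in `environ`.

    Upper-case names win over lower-case ones (an assumption of this sketch);
    unset or empty values are skipped.
    """
    entries = []
    for name in PROXY_VARS:
        value = environ.get(name) or environ.get(name.lower())
        if value:
            entries.append({"name": name, "value": value})
    return entries


if __name__ == "__main__":
    import os

    # Show what would be forwarded from the current shell environment.
    print(proxy_env_entries(os.environ))
```

Running the script with `HTTP_PROXY=xxx` set in the shell prints the entries that a tool could then add to the provider Deployments.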

/kind feature

enxebre commented 3 years ago

For such a scenario we would also want the ability to configure https_proxy and no_proxy.

We'd need to flesh out the details here: define and agree on what an air-gapped environment is and exactly which scenarios and behaviour we want to support end to end. E.g., would this be a one-shot thing, or would we want CAPI components to watch a "proxy config" and react to changes there? I think this probably deserves a proposal with all the details.

fabriziopandini commented 3 years ago

@hanlins I'm starting to think about this use case, and my main concern is that adding proxy settings can't be achieved by simple variable substitution, which is the only templating solution supported in clusterctl as of today. The only two options I can see here are:

- rely on a different templating solution injected into the clusterctl library
- use mutating webhooks

Also, the ongoing work on ManagedCluster might provide some help here, but whether it can help is still TBD. I'm happy to chat about this.
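
To make the mutating-webhook option more concrete, here is a minimal sketch of the patch such a webhook might return. It builds JSONPatch operations that add proxy variables to every container of a Deployment and wraps them in an `admission.k8s.io/v1` AdmissionReview response. The helper names `proxy_env_patch` and `admission_response` are hypothetical, the HTTP server and TLS setup a real webhook needs are left out, and nothing here is taken from Cluster API itself.

```python
import base64
import json
from typing import Dict, List


def proxy_env_patch(deployment: dict, env: List[Dict[str, str]]) -> List[dict]:
    """Build JSONPatch operations that add `env` entries to every container."""
    ops = []
    containers = deployment["spec"]["template"]["spec"]["containers"]
    for i, container in enumerate(containers):
        base = f"/spec/template/spec/containers/{i}/env"
        existing = container.get("env")
        if existing is None:
            # The env list must be created before entries can be appended.
            ops.append({"op": "add", "path": base, "value": list(env)})
            continue
        names = {e["name"] for e in existing}
        for entry in env:
            # Leave variables that the manifest already defines untouched.
            if entry["name"] not in names:
                ops.append({"op": "add", "path": base + "/-", "value": entry})
    return ops


def admission_response(uid: str, ops: List[dict]) -> dict:
    """Wrap the patch in an admission.k8s.io/v1 AdmissionReview response."""
    patch = base64.b64encode(json.dumps(ops).encode()).decode()
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": uid,
            "allowed": True,
            "patchType": "JSONPatch",
            "patch": patch,
        },
    }
```

The `uid` must be copied from the incoming AdmissionReview request so the API server can match the response to it.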

vincepri commented 3 years ago

/milestone Next

k8s-triage-robot commented 3 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue or PR with /reopen
Mark this issue or PR as fresh with /remove-lifecycle rotten
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-ci-robot commented 2 years ago

@k8s-triage-robot: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/cluster-api/issues/4585#issuecomment-985760605):

>The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
>This bot triages issues and PRs according to the following rules:
>- After 90d of inactivity, `lifecycle/stale` is applied
>- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
>- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
>You can:
>- Reopen this issue or PR with `/reopen`
>- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
>- Offer to help out with [Issue Triage][1]
>
>Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
>/close
>
>[1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

joejulian commented 2 years ago

/reopen

We just encountered a customer that needs this, too.

It could be done through templating in cmd/clusterctl/client/repository.NewComponents with an option that contains the values for https_proxy, http_proxy, and no_proxy.
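
A minimal sketch of what such an option could do at render time, assuming the provider manifests have already been parsed into plain dictionaries: it appends the proxy variables to every container of every Deployment. The function `inject_proxy_env` is hypothetical and does not reflect the actual signature or behaviour of `NewComponents`.

```python
import copy
from typing import Dict, Iterable, List


def inject_proxy_env(objects: Iterable[dict], env: List[Dict[str, str]]) -> List[dict]:
    """Return copies of `objects` with `env` added to every Deployment container.

    Variables that a container already defines are left untouched.
    """
    result = []
    for obj in objects:
        obj = copy.deepcopy(obj)  # never mutate the caller's manifests
        if obj.get("kind") == "Deployment":
            for container in obj["spec"]["template"]["spec"]["containers"]:
                current = container.setdefault("env", [])
                names = {e["name"] for e in current}
                current.extend(dict(e) for e in env if e["name"] not in names)
        result.append(obj)
    return result
```

Skipping variables that already exist is a design choice of this sketch: it lets a provider that ships its own proxy settings keep them.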

k8s-ci-robot commented 2 years ago

@joejulian: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to [this](https://github.com/kubernetes-sigs/cluster-api/issues/4585#issuecomment-1027310851):

>/reopen
>
>We just encountered a customer that needs this, too.
>
>It could be done through templating in [cmd/clusterctl/client/repository.NewComponents](https://github.com/kubernetes-sigs/cluster-api/blob/5756be13affef606b6683c2c413f0987c1f0e451/cmd/clusterctl/client/repository/components.go#L195) with an option that contains the values for https_proxy, http_proxy, and no_proxy.

dlipovetsky commented 2 years ago

/reopen

k8s-ci-robot commented 2 years ago

@dlipovetsky: Reopened this issue.

In response to [this](https://github.com/kubernetes-sigs/cluster-api/issues/4585#issuecomment-1027328057):

>/reopen

dlipovetsky commented 2 years ago

/lifecycle frozen

sbueringer commented 2 years ago

/assign @ykakarap

Can you please assess whether it would be possible to extend clusterctl to inject HTTP proxy env vars into the provider manifests?

fabriziopandini commented 2 years ago

/milestone v1.2

faiq commented 2 years ago

Hey, I left a message on the #cluster-api Slack channel to no avail :( Is it possible to get involved with the effort here? What criteria are we going to use to assess whether this is possible? I'd love to see this feature happen, so please let me know where I can help.

ykakarap commented 2 years ago

Catching up on the issue. Will get back soon. :)

@faiq I will take a look at this and post my findings here.

fabriziopandini commented 2 years ago

/triage accepted
/unassign @ykakarap

@joejulian could you share how you fixed this problem as per https://github.com/kubernetes-sigs/cluster-api/issues/4585#issuecomment-1027310851 so someone can pick up the work in CAPI?
/help

k8s-ci-robot commented 2 years ago

@fabriziopandini: This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

Why are we solving this issue?
To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
Does this issue have zero to low barrier of entry?
How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.

In response to [this](https://github.com/kubernetes-sigs/cluster-api/issues/4585):

>/triage accepted
>/unassign @ykakarap
>
>@joejulian could you share how you fixed this problem as per https://github.com/kubernetes-sigs/cluster-api/issues/4585#issuecomment-1027310851 so someone can pick up the work in CAPI
>/help

faiq commented 2 years ago

@fabriziopandini we modify the core-components.yaml file with this kustomize overlay:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: NA
spec:
  template:
    spec:
      containers:
        - name: manager
          env:
            - name: HTTP_PROXY
              value: ${HTTP_PROXY:=""}
            - name: HTTPS_PROXY
              value: ${HTTPS_PROXY:=""}
            - name: NO_PROXY
              value: ${NO_PROXY:=""}
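
To show how the `${NAME:=default}` placeholders in the overlay above resolve, here is a simplified stand-in for the variable substitution step. It follows the shell convention that the default applies when the variable is unset or empty; whether clusterctl's substitution matches this in every detail is an assumption, and the `substitute` function is written only for this example.

```python
import re
from typing import Mapping

# Matches ${NAME:=default}; the default may be empty but cannot contain "}".
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*):=([^}]*)\}")


def substitute(template: str, variables: Mapping[str, str]) -> str:
    """Replace each ${NAME:=default} with the variable's value, or with the
    default text when the variable is unset or empty."""

    def repl(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        return variables.get(name) or default

    return _PATTERN.sub(repl, template)


if __name__ == "__main__":
    line = 'value: ${HTTP_PROXY:=""}'
    print(substitute(line, {"HTTP_PROXY": "http://proxy.example:3128"}))
    print(substitute(line, {}))
```

With the variable unset, the placeholder collapses to the literal default `""`, so the rendered manifest carries an empty-string value rather than a missing field.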

dlipovetsky commented 2 years ago

Sounds like we have at least 3 options, in order from "least work required" to "most work required" from our users:

1. Include these env variables in the manifest for the core provider.
2. Document how to add these env variables by patching the manifest, e.g. with kustomize.
3. Document how to use a mutating webhook to set these env variables.

(In every case, users need to include information like the Pods and Services CIDRs in the NO_PROXY variable, along with fixed values like localhost, etc.)
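
As a sketch of how a `NO_PROXY` value could be assembled from the pieces mentioned above, the helper below joins fixed loopback entries with the Pod and Service CIDRs and any extra suffixes. The function name `build_no_proxy`, the `127.0.0.1` entry, and the example CIDRs and suffixes are assumptions for illustration; the right values depend on the cluster, and not every HTTP client interprets CIDR notation in `NO_PROXY`.

```python
import ipaddress
from typing import Iterable, List

# Loopback entries; only "localhost" is named in the thread, 127.0.0.1 is assumed.
FIXED = ("localhost", "127.0.0.1")


def build_no_proxy(
    pod_cidrs: Iterable[str],
    service_cidrs: Iterable[str],
    extra: Iterable[str] = (),
) -> str:
    """Return a comma-separated NO_PROXY value without duplicate entries."""
    entries: List[str] = list(FIXED)
    for cidr in [*pod_cidrs, *service_cidrs]:
        # ip_network raises ValueError for malformed CIDRs or set host bits.
        entries.append(str(ipaddress.ip_network(cidr)))
    entries.extend(extra)
    # dict.fromkeys drops duplicates while preserving insertion order.
    return ",".join(dict.fromkeys(entries))


if __name__ == "__main__":
    # Example values only; substitute the CIDRs of the actual cluster.
    print(build_no_proxy(["192.168.0.0/16"], ["10.96.0.0/12"], [".svc", ".cluster.local"]))
```

Validating the CIDRs up front catches typos before they end up in a Deployment, where a bad `NO_PROXY` entry tends to fail silently.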

joejulian commented 2 years ago

@fabriziopandini I don't remember what we did (and I don't work there anymore so I can't go back and check).

joejulian commented 2 years ago

>Sounds like we have at least 3 options, in order from "least work required" to "most work required" from our users:
>
>1. Include these env variables in the manifest for the core provider.
>2. Document how to add these env variables by patching the manifest, e.g. with kustomize.
>3. Document how to use a mutating webhook to set these env variables.
>
>(In every case, users need to include information like the Pods and Services CIDRs in the NO_PROXY variable, along with fixed values like localhost, etc.)

I think it's obvious I support 1. :)

Seems odd that we'd rebuild this entire toolset around templating but this one bit we'd require using kustomize.
How would this webhook be installed without the proxy config?

fabriziopandini commented 2 years ago

I agree that adding env var to the manifest is the simplest way forward, my only concern is that in the past we got push-back for this type of change by folks using git-ops and trying to use yaml files directly (and in fact there is https://github.com/kubernetes-sigs/cluster-api/issues/3881 asking to remove all the variables we currently have).

joejulian commented 2 years ago

I've never been a fan of adding the complexity of templating to cluster-api a la ClusterClass, but the community felt the return was worth it. Embracing that change; I'm not sure, now, where the distinction lies between one form of templating and another. Is there a way to solve this that's more in line with ClusterClass, maybe?

sbueringer commented 2 years ago

Q: 1. Include these env variables in the manifest for the core provider.

In air-gapped environment, cluster API provider pods might be deployed in air-gapped environment, and thus cannot talk to the infrastructure provider directly.

Just for my understanding. For which connections do we need the http proxy configuration?

1. communication from CAPI to infra provider APIs (AWS, Azure, ...)
2. communication from CAPI to workload clusters
3. both

I'm just a bit confused because the original ask was for the infra provider, but core CAPI is not accessing it. And having it consistently in infra providers would require agreement with infra providers (maybe an addition to the contract)

chrischdi commented 2 years ago

4. communication from workload clusters to endpoints (registry, internet, ...)

sbueringer commented 2 years ago

>4. communication from workload clusters to endpoints (registry, internet, ...)

Should probably be from controllers / mgmt cluster to registry/internet?

I think the issue is only about setting the proxy for CAPI providers/controllers (based on the issue description).

But based on the title it could be proxy support in general.

joejulian commented 2 years ago

I don't think you can add generalized proxy support. There's no standard.

enxebre commented 1 year ago

>Sounds like we have at least 3 options, in order from "least work required" to "most work required" from our users:
>
>1. Include these env variables in the manifest for the core provider.
>2. Document how to add these env variables by patching the manifest, e.g. with kustomize.
>3. Document how to use a mutating webhook to set these env variables.
>
>(In every case, users need to include information like the Pods and Services CIDRs in the NO_PROXY variable, along with fixed values like localhost, etc.)

Agreed, at minimum we could provide some guidance docs

/kind documentation

fabriziopandini commented 5 months ago

/priority backlog

kubernetes-sigs / cluster-api

Need proxy support in air-gapped environment #4585
