canonical / microk8s

MicroK8s is a small, fast, single-package Kubernetes for datacenters and the edge.
https://microk8s.io
Apache License 2.0

Enabling kubeflow behind a proxy fails #991

Closed Helumpago closed 2 years ago

Helumpago commented 4 years ago

I ran into a few issues when trying to run microk8s.enable kubeflow.

I'm using a Debian 9 machine behind a proxy.

The first problem I ran into was that microk8s.enable couldn't download Juju:

$ microk8s.enable kubeflow
Enabling dns...
Enabling storage...
Enabling dashboard...
Enabling ingress...
Enabling rbac...
Enabling juju...
Kubeflow could not be enabled:
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (7) Failed to connect to launchpad.net port 443: Connection refused

Command '('microk8s-enable.wrapper', 'juju')' returned non-zero exit status 1
Failed to enable kubeflow

This was happening even after I told microk8s about the proxy.

I had to add the following to /etc/sudoers (using sudo visudo):

Defaults        env_keep="HTTP_PROXY HTTPS_PROXY http_proxy https_proxy NO_PROXY no_proxy"

This allowed the Juju install to complete successfully.

Suggestion 1: It doesn't look like the run_with_sudo function passes the proxy environment variables down to its sudo commands. It also doesn't seem like the commands that are run by run_with_sudo respect the proxy settings file. Could either of these options be implemented?
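
As a rough illustration of Suggestion 1, a wrapper could forward the proxy variables by prefixing the command with sudo env. This is only a sketch and not the actual run_with_sudo implementation: the helper name sudo_command_with_proxy is hypothetical, and it assumes the sudoers policy lets the user run env.

```python
import os

# Proxy-related variables, matching the env_keep list shown above.
PROXY_VARS = (
    "HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY",
    "http_proxy", "https_proxy", "no_proxy",
)


def sudo_command_with_proxy(command, environ=None):
    """Return an argv list that runs `command` under sudo with proxy variables set.

    Hypothetical helper: it only builds the argument list and does not run it.
    """
    environ = os.environ if environ is None else environ
    assignments = [
        f"{name}={environ[name]}" for name in PROXY_VARS if name in environ
    ]
    # `sudo env NAME=value ... command` sets the variables inside the
    # elevated process instead of relying on sudo's env_keep setting.
    return ["sudo", "env", *assignments, *command]
```

The returned list can be handed to subprocess.run. Note that values passed this way are visible in the process list, which matters if the proxy URL embeds credentials.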


Once Juju installed correctly, I got the following error:

ERROR ensuring k8s credential "microk8s" with RBAC setup: ensuring cluster role "juju-credential-microk8s" in namespace "kube-system": Get https://192.168.148.132:16443/apis/rbac.authorization.k8s.io/v1/clusterroles/juju-credential-microk8s: Service Unavailable

I had to add 192.168.0.0/16 to my NO_PROXY environment variable. Once I did that, I was able to contact the Kubernetes server correctly.

Suggestion 2: Would it be possible to have your install script override the NO_PROXY variable so that you add whatever subnet you're configuring for Kubernetes?
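
A minimal sketch of what Suggestion 2 could look like: appending the cluster subnet to an existing NO_PROXY value without clobbering what the user already set. The helper name merge_no_proxy is hypothetical. It also assumes the consuming tool accepts CIDR entries in NO_PROXY, which is not true of every client.

```python
def merge_no_proxy(current, extra):
    """Append entries from `extra` to a comma-separated NO_PROXY value.

    Existing entries keep their order; duplicates are skipped.
    """
    entries = [item.strip() for item in current.split(",") if item.strip()]
    for item in extra:
        if item not in entries:
            entries.append(item)
    return ",".join(entries)
```

For example, merge_no_proxy("localhost,127.0.0.1", ["192.168.0.0/16"]) keeps the two existing entries and adds the subnet at the end.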

Suggestion 3: This is only semi-related and probably doesn't belong in this issue, but it would be great to have a "verbose" mode for microk8s.enable so that we can get more information about where things are falling over.

Helumpago commented 4 years ago

Looks like the install ran into another problem. After making the above configuration changes, microk8s.enable kubeflow sat for a long time at the "Deploying Kubeflow..." step, then crashed with the following message:

$ microk8s.enable kubeflow
Enabling dns...
Enabling storage...
Enabling dashboard...
Enabling ingress...
Enabling rbac...
Enabling juju...
Deploying Kubeflow...
Kubeflow could not be enabled:
Creating Juju controller "uk8s" on microk8s/localhost
Creating k8s resources for controller "controller-uk8s"
Downloading images
Starting controller pod
Bootstrap agent now started
Contacting Juju controller at 10.152.183.67 to verify accessibility...
ERROR unable to contact api server after 1 attempts: unable to connect to API: Service Unavailable

Command '('microk8s-juju.wrapper', 'bootstrap', 'microk8s', 'uk8s')' returned non-zero exit status 1
Failed to enable kubeflow

I added 10.0.0.0/8 to NO_PROXY.
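
To see why a second entry was needed, here is a small sketch that checks whether an IP address is covered by the CIDR or IP entries of a NO_PROXY value. The helper name bypasses_proxy is hypothetical, and the matching rule (plain CIDR containment) is an assumption: individual clients may interpret NO_PROXY differently.

```python
import ipaddress


def bypasses_proxy(address, no_proxy):
    """Return True if `address` falls inside any IP or CIDR entry of `no_proxy`."""
    ip = ipaddress.ip_address(address)
    for entry in no_proxy.split(","):
        entry = entry.strip()
        if not entry:
            continue
        try:
            network = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            # Hostname or domain entries cannot match an IP literal here.
            continue
        if ip in network:
            return True
    return False
```

With only 192.168.0.0/16 listed, the API server address 192.168.148.132 is covered but the Juju controller address 10.152.183.67 is not; adding 10.0.0.0/8 covers it.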

After doing that, things got a bit further, but crashed here:

$ microk8s.enable kubeflow
Enabling dns...
Enabling storage...
Enabling dashboard...
Enabling ingress...
Enabling rbac...
Enabling juju...
Deploying Kubeflow...
Kubeflow could not be enabled:
Located bundle "cs:bundle/kubeflow-170"
Resolving charm: cs:~kubeflow-charmers/ambassador-54
Resolving charm: cs:~kubeflow-charmers/argo-controller-53
Resolving charm: cs:~kubeflow-charmers/argo-ui-54
Resolving charm: cs:~kubeflow-charmers/jupyter-controller-54
Resolving charm: cs:~kubeflow-charmers/jupyter-web-56
Resolving charm: cs:~kubeflow-charmers/katib-controller-52
Resolving charm: cs:~charmed-osm/mariadb-k8s
Resolving charm: cs:~kubeflow-charmers/katib-manager-50
Resolving charm: cs:~kubeflow-charmers/katib-ui-46
Resolving charm: cs:~kubeflow-charmers/kubeflow-dashboard-11
Resolving charm: cs:~kubeflow-charmers/kubeflow-gatekeeper-14
Resolving charm: cs:~kubeflow-charmers/kubeflow-login-13
Resolving charm: cs:~kubeflow-charmers/kubeflow-profiles-16
Resolving charm: cs:~kubeflow-charmers/metacontroller-43
Resolving charm: cs:~kubeflow-charmers/metadata-api-6
Resolving charm: cs:~charmed-osm/mariadb-k8s
Resolving charm: cs:~kubeflow-charmers/metadata-ui-10
Resolving charm: cs:~kubeflow-charmers/minio-54
Resolving charm: cs:~kubeflow-charmers/modeldb-backend-49
Resolving charm: cs:~charmed-osm/mariadb-k8s
Resolving charm: cs:~kubeflow-charmers/modeldb-store-45
Resolving charm: cs:~kubeflow-charmers/modeldb-ui-44
Resolving charm: cs:~kubeflow-charmers/pipelines-api-58
Resolving charm: cs:~charmed-osm/mariadb-k8s
Resolving charm: cs:~kubeflow-charmers/pipelines-persistence-53
Resolving charm: cs:~kubeflow-charmers/pipelines-scheduledworkflow-55
Resolving charm: cs:~kubeflow-charmers/pipelines-ui-53
Resolving charm: cs:~kubeflow-charmers/pipelines-viewer-55
Resolving charm: cs:~kubeflow-charmers/pytorch-operator-55
Resolving charm: cs:~kubeflow-charmers/tf-job-dashboard-55
Resolving charm: cs:~kubeflow-charmers/tf-job-operator-53
ERROR cannot deploy bundle: cannot add charm "cs:~kubeflow-charmers/ambassador-54": cannot retrieve charm "cs:~kubeflow-charmers/ambassador-54": cannot get archive: Get https://api.jujucharms.com/charmstore/v5/~kubeflow-charmers/ambassador-54/archive?channel=stable: dial tcp: lookup api.jujucharms.com on 10.152.183.10:53: server misbehaving

Command '('microk8s-juju.wrapper', 'deploy', 'cs:kubeflow', '--channel', 'stable', '--overlay', '/tmp/tmpo8d2nud_')' returned non-zero exit status 1
Failed to enable kubeflow

My theory at this point is that the proxy environment variables aren't being moved into whatever container is at 10.152.183.10, which means it can't access the outside world.

Any thoughts on how to fix this?

knkski commented 4 years ago

Regarding suggestion 2, you can set the KUBEFLOW_NO_PROXY environment variable to have Juju bypass the proxy for that IP range while bootstrapping, which seems to be what you're looking for.

For 3, you can set the environment variable KUBEFLOW_DEBUG=true to get verbose logging.
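
A minimal sketch of setting both variables for a single invocation from a script. The helper name build_kubeflow_env is hypothetical, and the comma-separated format for KUBEFLOW_NO_PROXY is an assumption based on how NO_PROXY is usually written.

```python
import os


def build_kubeflow_env(no_proxy_entries, debug=False, base_env=None):
    """Return a copy of the environment with the Kubeflow variables added."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumed format: comma-separated entries, as with NO_PROXY.
    env["KUBEFLOW_NO_PROXY"] = ",".join(no_proxy_entries)
    if debug:
        env["KUBEFLOW_DEBUG"] = "true"
    return env
```

The resulting dictionary can be passed as the env argument of subprocess.run when invoking microk8s.enable kubeflow.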

Additionally, #989 includes the Juju binary in the snap instead of downloading it, so you won't have that issue going forward as soon as that's merged.

Helumpago commented 4 years ago

Thanks, @knkski! Good to know there are options for this.

I don't suppose you have any suggestions on how to deal with 10.152.183.10 not being able to reach api.jujucharms.com?

knkski commented 4 years ago

@Helumpago: Is it able to reach api.jujucharms.com now? I would guess that you were just encountering a network issue.

evertonberz commented 4 years ago

Hi, I am also behind a proxy, and the Juju download now works fine with the edge channel. However, I get a CONNECTnotallowed error when Juju tries to contact the API server.

18:59:31 DEBUG juju.kubernetes.provider events.go:51 getting the latest event for "involvedObject.name=controller-0,involvedObject.kind=Pod"
18:59:31 INFO  cmd bootstrap.go:729 Starting controller pod
18:59:31 INFO  cmd bootstrap.go:613 Bootstrap agent now started
18:59:31 DEBUG juju.kubernetes.provider events.go:51 getting the latest event for "involvedObject.name=controller,involvedObject.kind=StatefulSet"
18:59:31 INFO  juju.juju api.go:302 API endpoints changed from [] to [10.152.183.21:17070]
18:59:31 INFO  cmd controller.go:89 Contacting Juju controller at 10.152.183.21 to verify accessibility...
18:59:31 INFO  juju.juju api.go:67 connecting to API addresses: [10.152.183.21:17070]
19:09:29 ERROR juju.cmd.juju.commands bootstrap.go:778 unable to contact api server after 1 attempts: unable to connect to API: CONNECTnotallowed
19:09:29 DEBUG juju.cmd.juju.commands bootstrap.go:779 (error details: [{/workspace/_build/src/github.com/juju/juju/cmd/juju/common/controller.go:128: unable to contact api server after 1 attempts} {/workspace/_build/src/github.com/juju/juju/cmd/juju/common/controller.go:44: } {/workspace/_build/src/github.com/juju/juju/cmd/modelcmd/modelcommand.go:405: } {/workspace/_build/src/github.com/juju/juju/cmd/modelcmd/modelcommand.go:424: } {/workspace/_build/src/github.com/juju/juju/cmd/modelcmd/base.go:214: } {/workspace/_build/src/github.com/juju/juju/juju/api.go:72: } {/workspace/_build/src/github.com/juju/juju/api/apiclient.go:207: } {/workspace/_build/src/github.com/juju/juju/api/apiclient.go:622: } {/workspace/_build/src/github.com/juju/juju/api/apiclient.go:967: } {/workspace/_build/src/github.com/juju/juju/api/apiclient.go:1071: unable to connect to API} {/workspace/_build/src/github.com/juju/juju/api/apiclient.go:1096: } {CONNECTnotallowed}])
19:09:29 DEBUG juju.cmd.juju.commands bootstrap.go:1424 cleaning up after failed bootstrap

I have already tried setting KUBEFLOW_NO_PROXY, but I got the same error. Steps to reproduce:

snap install microk8s --classic --channel=edge
# no proxy setting: 
# printf -v kube_no_proxy '%s,' 10.152.183.{1..255};
# export KUBEFLOW_NO_PROXY=${kube_no_proxy%,},localhost,127.0.0.1,::1
KUBEFLOW_DEBUG=true microk8s.enable kubeflow

blacksailer commented 4 years ago

I've hit the same issue. I configured CoreDNS to resolve the address (as explained here), but I can't download through the proxy.

How do I configure containerd? The containerd-env file is not helping.

ktsakalozos commented 4 years ago

@blacksailer have you tried the instructions in https://microk8s.io/docs/install-proxy ? What is the state of your cluster? Can you attach a microk8s inspect tarball? Thank you. Apologies for the late reply.

blacksailer commented 4 years ago

Well, I removed it already and installed through kubeadm.

Yes, I tried the instructions on the site, but Juju was complaining about the network. I will check it later this week. I also tried configuring the proxy environment variables for Juju (juju_no_proxy, juju_http_proxy), but that didn't help.

gabrielecastellano commented 3 years ago

Hello, I am experiencing similar issues running behind a proxy. I have already followed the proxy configuration in https://microk8s.io/docs/install-proxy without any improvement. When enabling kubeflow, it gets stuck for a while on "deploying kubeflow" and eventually fails when contacting the Juju controller with: ERROR unable to contact api server after 1 attempts: Gateway Timeout

It is not clear to me what I should add to the NO_PROXY variable.

Anyway, the problem seems bigger than kubeflow itself. After I deploy any pod, if I try to exec any command, it seems kubectl can't communicate with the pod at all: Error from server: error dialing backend: dial tcp 172.27.205.175:10250: i/o timeout

Is there any way of making this work today?

Thanks

knkski commented 3 years ago

@jameinel: do you know what values should be put in the no-proxy configuration?

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.