canonical / microk8s

MicroK8s is a small, fast, single-package Kubernetes for datacenters and the edge.
https://microk8s.io
Apache License 2.0

microk8s is not running. microk8s.inspect showing no error #886

Open ibigbug opened 4 years ago

ibigbug commented 4 years ago

Please run microk8s.inspect and attach the generated tarball to this issue.

wtf@k8s-master:~$ microk8s.inspect
Inspecting services
  Service snap.microk8s.daemon-cluster-agent is running
  Service snap.microk8s.daemon-flanneld is running
  Service snap.microk8s.daemon-containerd is running
  Service snap.microk8s.daemon-apiserver is running
  Service snap.microk8s.daemon-apiserver-kicker is running
  Service snap.microk8s.daemon-proxy is running
  Service snap.microk8s.daemon-kubelet is running
  Service snap.microk8s.daemon-scheduler is running
  Service snap.microk8s.daemon-controller-manager is running
  Service snap.microk8s.daemon-etcd is running
  Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
  Copy processes list to the final report tarball
  Copy snap list to the final report tarball
  Copy VM name (or none) to the final report tarball
  Copy disk usage information to the final report tarball
  Copy memory usage information to the final report tarball
  Copy server uptime to the final report tarball
  Copy current linux distribution to the final report tarball
  Copy openSSL information to the final report tarball
  Copy network configuration to the final report tarball
Inspecting kubernetes cluster
  Inspect kubernetes cluster

Building the report tarball
  Report tarball is at /var/snap/microk8s/1107/inspection-report-20200102_011315.tar.gz

inspection-report-20200102_011315.tar.gz

wtf@k8s-master:~$ microk8s.status
microk8s is not running. Use microk8s.inspect for a deeper inspection.

We appreciate your feedback. Thank you for using microk8s.

balchua commented 4 years ago

Your apiserver is complaining about an invalid bearer token.

Jan 02 01:13:06 k8s-master.syd.home microk8s.daemon-apiserver[4971]: E0102 01:13:06.280497    4971 authentication.go:104] Unable to authenticate the request due to an error: invalid bearer token
Jan 02 01:13:06 k8s-master.syd.home microk8s.daemon-apiserver[4971]: E0102 01:13:06.453439    4971 authentication.go:104] Unable to authenticate the request due to an error: invalid bearer token

Was this a fresh installation?
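
In the meantime, those apiserver log lines can be pulled straight from the journal (assuming the default snap service name shown in the logs above):

$ sudo journalctl -u snap.microk8s.daemon-apiserver -n 200 --no-pager | grep "invalid bearer token"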

ibigbug commented 4 years ago

@balchua no, it's not. I rebooted the machine after it had been running for a while.

balchua commented 4 years ago

Thanks @ibigbug. Can you try restarting microk8s (microk8s.stop then microk8s.start) to see if that resolves the issue?

ibigbug commented 4 years ago

@balchua that doesn't seem to be working:

wtf@k8s-master:~$ microk8s.stop
[sudo] password for wtf:
Stopped.
wtf@k8s-master:~$ microk8s.start
Started.
Enabling pod scheduling
wtf@k8s-master:~$ microk8s.status
microk8s is not running. Use microk8s.inspect for a deeper inspection.
wtf@k8s-master:~$ 
ibigbug commented 4 years ago

pod status

admin@k8s-master:~$ kubectl get po -n kube-system
NAME                                              READY   STATUS        RESTARTS   AGE
coredns-9b8997588-kldbv                           0/1     Pending       0          2d14h
coredns-9b8997588-xllr9                           0/1     Terminating   0          14d
dashboard-metrics-scraper-687667bb6c-kg6zd        1/1     Terminating   0          14d
dashboard-metrics-scraper-687667bb6c-sqdj4        0/1     Pending       0          2d13h
filebeat-p6nfk                                    1/1     Running       0          14d
filebeat-w55z9                                    1/1     Running       1          14d
heapster-v1.5.2-5c58f64f8b-4dfw2                  4/4     Terminating   0          14d
heapster-v1.5.2-5c58f64f8b-v5699                  0/4     Pending       0          2d13h
hostpath-provisioner-7b9cb5cdb4-f7jh7             1/1     Terminating   0          14d
hostpath-provisioner-7b9cb5cdb4-wgmwq             0/1     Pending       0          2d13h
kubernetes-dashboard-5c848cc544-4rlxr             1/1     Terminating   1          14d
kubernetes-dashboard-5c848cc544-j2vzv             0/1     Pending       0          2d14h
metricbeat-55f4fc45cb-5whm2                       1/1     Terminating   1          14d
metricbeat-55f4fc45cb-l49zf                       0/1     Pending       0          2d14h
metricbeat-cw92z                                  1/1     Running       0          14d
metricbeat-kkq8s                                  1/1     Running       1          14d
monitoring-influxdb-grafana-v4-6d599df6bf-lzqtw   2/2     Terminating   2          14d
monitoring-influxdb-grafana-v4-6d599df6bf-pfsdx   0/2     Pending       0          2d14h
balchua commented 4 years ago

Are you running multiple nodes?

ibigbug commented 4 years ago

yes 1 master + 1 follower

balchua commented 4 years ago

Can you go to the worker/follower node and do a microk8s.stop and microk8s.start?

ibigbug commented 4 years ago

it doesn't actually allow me:

admin@k8s-node1:~$ microk8s.stop
This MicroK8s deployment is acting as a node in a cluster. Please use the microk8s.stop on the master.
balchua commented 4 years ago

Is it possible to make it a single-node cluster to see if it is still running? I think you may need to do microk8s.leave or microk8s.remove-node, something like that.
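
Roughly, something like this (using k8s-node1, the worker name from your output above; adjust to the actual node name shown by microk8s.kubectl get no):

# on the worker/follower node
$ microk8s.leave
# on the master node
$ microk8s.remove-node k8s-node1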

ibigbug commented 4 years ago

Still not working. Maybe I'll just reinstall.

balchua commented 4 years ago

You may want to pin it to a particular channel, e.g. 1.16/stable.
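
For example, to track a specific channel (a sketch; pick whichever release you want to stay on):

$ sudo snap install microk8s --classic --channel=1.16/stable
# or, on an existing install
$ sudo snap refresh microk8s --channel=1.16/stable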

ktsakalozos commented 4 years ago

@ibigbug I see that the kubelets cannot register with the apiserver. The last time they registered with the API server was on the 22nd of Dec. The error you have looks like this:

Jan 02 01:13:06 k8s-master.syd.home microk8s.daemon-kubelet[9551]: E0102 01:13:06.297471    9551 kubelet.go:2263] node "k8s-master.syd.home" not found
Jan 02 01:13:06 k8s-master.syd.home microk8s.daemon-kubelet[9551]: E0102 01:13:06.399105    9551 kubelet.go:2263] node "k8s-master.syd.home" not found
Jan 02 01:13:06 k8s-master.syd.home microk8s.daemon-kubelet[9551]: E0102 01:13:06.487449    9551 reflector.go:156] k8s.io/kubernetes/pkg/kubelet/kubelet.go:458: Failed to list *v1.Node: Unauthorized

Any idea what might have changed around then?

ibigbug commented 4 years ago

If it's saying the node was not found, could that be due to the reboot of the VM?

pankajxyz commented 4 years ago

I also had the same issue. It happens with v1.17 only; other versions (v1.16, v1.15, v1.14) are OK. It also happens with v1.17 after I try to install Kubeflow using microk8s.enable kubeflow, which throws an error about Juju. To resolve that I installed Juju and LXD and ran juju bootstrap; after this, microk8s.status reports that microk8s is not running.

I reproduced this behaviour in another machine as well.

TribalNightOwl commented 4 years ago

Same error. Running single node. microk8s version: installed: v1.17.2 (1173) 179MB classic

$ microk8s.start
Started.
Enabling pod scheduling
$ microk8s.status
microk8s is not running. Use microk8s.inspect for a deeper inspection.
$ microk8s.inspect 
Inspecting services
  Service snap.microk8s.daemon-cluster-agent is running
  Service snap.microk8s.daemon-flanneld is running
  Service snap.microk8s.daemon-containerd is running
  Service snap.microk8s.daemon-apiserver is running
  Service snap.microk8s.daemon-apiserver-kicker is running
  Service snap.microk8s.daemon-proxy is running
  Service snap.microk8s.daemon-kubelet is running
  Service snap.microk8s.daemon-scheduler is running
  Service snap.microk8s.daemon-controller-manager is running
  Service snap.microk8s.daemon-etcd is running
  Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
  Copy processes list to the final report tarball
  Copy snap list to the final report tarball
  Copy VM name (or none) to the final report tarball
  Copy disk usage information to the final report tarball
  Copy memory usage information to the final report tarball
  Copy server uptime to the final report tarball
  Copy current linux distribution to the final report tarball
  Copy openSSL information to the final report tarball
  Copy network configuration to the final report tarball
Inspecting kubernetes cluster
  Inspect kubernetes cluster

Building the report tarball
  Report tarball is at /var/snap/microk8s/1173/inspection-report-20200202_114517.tar.gz

inspection-report-20200202_114517.tar.gz

TribalNightOwl commented 4 years ago

Either removing and re-installing fixed the issue, or the version change did: installed: v1.17.0 (1109) 179MB classic

$ snap remove microk8s 
microk8s removed

$ microk8s.status
bash: /snap/bin/microk8s.status: No such file or directory

$ sudo snap install microk8s --classic --channel=1.17/stable

microk8s (1.17/stable) v1.17.0 from Canonical✓ installed

$ microk8s.start
Started.
Enabling pod scheduling
node/blushy already uncordoned

$ microk8s.status
microk8s is running
addons:
cilium: disabled
dashboard: disabled
dns: disabled
fluentd: disabled
gpu: disabled
helm: disabled
ingress: disabled
istio: disabled
jaeger: disabled
juju: disabled
knative: disabled
kubeflow: disabled
linkerd: disabled
metallb: disabled
metrics-server: disabled
prometheus: disabled
rbac: disabled
registry: disabled
storage: disabled
TribalNightOwl commented 4 years ago

After several deletes and re-installs, I narrowed it down to microk8s dying the moment I try to change the context in use.

I enabled DNS, then created two namespaces and two contexts; I checked the status of microk8s after each command and it was running.

$ kubectl get namespaces 
NAME                 STATUS   AGE
default              Active   52s
jenkinsmaster-dev    Active   5s
jenkinsmaster-prod   Active   5s
kube-node-lease      Active   66s
kube-public          Active   66s
kube-system          Active   67s

$ kubectl config view
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://127.0.0.1:16443
  name: microk8s-cluster
contexts:
- context:
    cluster: microk8s
    namespace: jenkinsmaster-dev
    user: admin
  name: jenkinsmaster-dev
- context:
    cluster: microk8s
    namespace: jenkinsmaster-prod
    user: admin
  name: jenkinsmaster-prod
- context:
    cluster: microk8s-cluster
    user: admin
  name: microk8s
current-context: microk8s
kind: Config
preferences: {}
users:
- name: admin
  user:
    password: bCtlMTl6dUhSVXlFb1hVRXpYcWs0QUpzbFc4dFpPd2hsb3U4UVA0UFo0VT0K
    username: admin

$ kubectl config current-context 
microk8s

After I did:

$ kubectl config use-context jenkinsmaster-dev 
Switched to context "jenkinsmaster-dev".

$ microk8s.status
microk8s is not running. Use microk8s.inspect for a deeper inspection.
balchua commented 4 years ago

@TribalNightOwl thanks for the info. When you added the context, did you add it to the file /var/snap/microk8s/current/credentials/client.config? And is the kubectl you are using an alias? Thanks again.

balchua commented 4 years ago

@TribalNightOwl your contexts jenkinsmaster-dev and jenkinsmaster-prod are pointing to a non-existent cluster, microk8s. It should be microk8s-cluster.
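
The quickest fix is probably to recreate those contexts against the cluster name that actually exists, along these lines:

$ microk8s.kubectl config set-context jenkinsmaster-dev --namespace=jenkinsmaster-dev --cluster=microk8s-cluster --user=admin
$ microk8s.kubectl config set-context jenkinsmaster-prod --namespace=jenkinsmaster-prod --cluster=microk8s-cluster --user=admin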

TribalNightOwl commented 4 years ago

@TribalNightOwl thanks for the info. When you added the context, did you add it to the file /var/snap/microk8s/current/credentials/client.config?

No, I just used these commands:

microk8s.kubectl config set-context jenkinsmaster-dev --namespace=jenkinsmaster-dev   --cluster=microk8s   --user=admin

microk8s.kubectl config set-context jenkinsmaster-prod --namespace=jenkinsmaster-prod   --cluster=microk8s   --user=admin

And is the kubectl you are using an alias?

yes:

alias kubectl='microk8s.kubectl'
TribalNightOwl commented 4 years ago

@TribalNightOwl your contexts jenkinsmaster-dev and jenkinsmaster-prod are pointing to a non-existent cluster, microk8s. It should be microk8s-cluster.

I will try again and report back. Although I would argue that microk8s shouldn't stop running (and refuse to start) due to this.

TribalNightOwl commented 4 years ago
$ snap install microk8s --classic
microk8s v1.17.2 from Canonical✓ installed

$ microk8s.enable dns
Enabling DNS
Applying manifest
serviceaccount/coredns created
configmap/coredns created
deployment.apps/coredns created
service/kube-dns created
clusterrole.rbac.authorization.k8s.io/coredns created
clusterrolebinding.rbac.authorization.k8s.io/coredns created
Restarting kubelet
[sudo] password for hugo: 
DNS is enabled

$ kubectl apply -f namespaces.yaml 
namespace/jenkinsmaster-dev created
namespace/jenkinsmaster-prod created

$ kubectl get namespaces 
NAME                 STATUS   AGE
default              Active   94s
jenkinsmaster-dev    Active   4s
jenkinsmaster-prod   Active   4s
kube-node-lease      Active   107s
kube-public          Active   107s
kube-system          Active   108s

$ microk8s.kubectl config set-context jenkinsmaster-dev --namespace=jenkinsmaster-dev \
>   --cluster=microk8s-cluster \
>   --user=admin
Context "jenkinsmaster-dev" created.

$ microk8s.kubectl config set-context jenkinsmaster-prod --namespace=jenkinsmaster-prod \
>   --cluster=microk8s-cluster \
>   --user=admin
Context "jenkinsmaster-prod" created.

$ kubectl config view
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://127.0.0.1:16443
  name: microk8s-cluster
contexts:
- context:
    cluster: microk8s-cluster
    namespace: jenkinsmaster-dev
    user: admin
  name: jenkinsmaster-dev
- context:
    cluster: microk8s-cluster
    namespace: jenkinsmaster-prod
    user: admin
  name: jenkinsmaster-prod
- context:
    cluster: microk8s-cluster
    user: admin
  name: microk8s
current-context: microk8s
kind: Config
preferences: {}
users:
- name: admin
  user:
    password: ZytkS1o5NVZhZWRTU0t3NnNReFhHaHpRcHRaaUxkaG1XNWFBTXFPbVNNaz0K
    username: admin

$ kubectl config use-context jenkinsmaster-dev 
Switched to context "jenkinsmaster-dev".

$ microk8s.status
microk8s is running
addons:
cilium: disabled
dashboard: disabled
dns: enabled
fluentd: disabled
gpu: disabled
helm3: disabled
helm: disabled
ingress: disabled
istio: disabled
jaeger: disabled
juju: disabled
knative: disabled
kubeflow: disabled
linkerd: disabled
metallb: disabled
metrics-server: disabled
prometheus: disabled
rbac: disabled
registry: disabled
storage: disabled

BINGO! it didn't die this time.

$ kubectl config current-context 
jenkinsmaster-dev
balchua commented 4 years ago

@TribalNightOwl microk8s is not actually dying. The status command uses the kubeconfig settings to verify the cluster's health, so if the kubeconfig is misconfigured it will not be able to gather Kubernetes resources and will therefore report that it is not running.

The message can be misleading though.
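
A quick way to recover in that situation is to switch back to the built-in microk8s context, which points at the existing microk8s-cluster:

$ microk8s.kubectl config use-context microk8s
$ microk8s.status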

TribalNightOwl commented 4 years ago
$ microk8s.kubectl config set-context jenkinsmaster-dev --namespace=jenkinsmaster-fail   --cluster=microk8s   --user=admin

$ kubectl config view
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://127.0.0.1:16443
  name: microk8s-cluster
contexts:
- context:
    cluster: microk8s
    namespace: jenkinsmaster-fail
    user: admin
  name: jenkinsmaster-dev
- context:
    cluster: microk8s-cluster
    namespace: jenkinsmaster-prod
    user: admin
  name: jenkinsmaster-prod
- context:
    cluster: microk8s-cluster
    user: admin
  name: microk8s
current-context: jenkinsmaster-dev
kind: Config
preferences: {}
users:
- name: admin
  user:
    password: ZytkS1o5NVZhZWRTU0t3NnNReFhHaHpRcHRaaUxkaG1XNWFBTXFPbVNNaz0K
    username: admin

$ microk8s.status
microk8s is not running. Use microk8s.inspect for a deeper inspection.
$ sudo vi /var/snap/microk8s/current/credentials/client.config

Manually deleted this section:

- context:
    cluster: microk8s
    namespace: jenkinsmaster-fail
    user: admin
  name: jenkinsmaster-dev

$ kubectl config view
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://127.0.0.1:16443
  name: microk8s-cluster
contexts:
- context:
    cluster: microk8s-cluster
    namespace: jenkinsmaster-prod
    user: admin
  name: jenkinsmaster-prod
- context:
    cluster: microk8s-cluster
    user: admin
  name: microk8s
current-context: jenkinsmaster-dev
kind: Config
preferences: {}
users:
- name: admin
  user:
    password: ZytkS1o5NVZhZWRTU0t3NnNReFhHaHpRcHRaaUxkaG1XNWFBTXFPbVNNaz0K
    username: admin

$ microk8s.status
microk8s is not running. Use microk8s.inspect for a deeper inspection.
TribalNightOwl commented 4 years ago

Hold on, I got it:

My previous context was still pointing to something non-existent.

I did:

$ kubectl config use-context jenkinsmaster-prod
Switched to context "jenkinsmaster-prod".

$ microk8s.status
microk8s is running
addons:
cilium: disabled
dashboard: disabled
dns: enabled
fluentd: disabled
gpu: disabled
helm3: disabled
helm: disabled
ingress: disabled
istio: disabled
jaeger: disabled
juju: disabled
knative: disabled
kubeflow: disabled
linkerd: disabled
metallb: disabled
metrics-server: disabled
prometheus: disabled
rbac: disabled
registry: disabled
storage: disabled

That completely proves your previous comment, thanks!

TribalNightOwl commented 4 years ago

How about changing the message:

Currently:

microk8s is not running. Use microk8s.inspect for a deeper inspection.

New:

microk8s is not running. Verify your config is valid and use microk8s.inspect for a deeper inspection.

Or:

microk8s is not running in cluster $CLUSTERNAME. Use microk8s.inspect for a deeper inspection.

Something that would make the user think about having a misconfigured client and not necessarily microk8s dying.

ktsakalozos commented 4 years ago

How about changing the message

We could also detect such problems and suggest a fix in microk8s.inspect https://github.com/ubuntu/microk8s/blob/master/scripts/inspect.sh#L106
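
A rough sketch of the kind of check that could be added (not the actual inspect.sh code, just the idea): verify that every context in client.config refers to a cluster that is actually defined.

# sketch: flag contexts that point at a cluster name not defined in the kubeconfig
clusters=$(microk8s.kubectl config view -o jsonpath='{.clusters[*].name}')
for c in $(microk8s.kubectl config view -o jsonpath='{.contexts[*].context.cluster}'); do
  echo "$clusters" | grep -qw "$c" || echo "WARNING: a context refers to an undefined cluster: $c"
done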

gavinB-orange commented 4 years ago

On my system, the problem went away after I updated the rather old kubectl installed in /usr/local/bin. I had assumed that microk8s would exclusively use its own kubectl, but apparently not.
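
If anyone else hits this, a quick way to spot a stale kubectl earlier on the PATH (a sketch):

$ which -a kubectl
/usr/local/bin/kubectl
$ kubectl version --client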

antsankov commented 4 years ago

Solved it for me @gavinB-orange: I had to remove my previously installed kubectl, and then microk8s started working!

rm -rf /usr/local/bin/kubectl

impravin22 commented 4 years ago

I have solved the issue. Do not run in root mode. Try running in user mode.

k8s-master@k8s-master:-$ sudo microk8s.status

ashrr108 commented 4 years ago

Try refreshing the certificates:

sudo microk8s refresh-certs

then check the microk8s status again; this should work.

Credentive-Sec commented 4 years ago

Try refreshing the certificates:

sudo microk8s refresh-certs

then check the microk8s status again; this should work.

This worked for me. Thank you!

ddombrowsky commented 3 years ago

Looks like this is a case where success does not mean success. For me, microk8s start was failing silently with no error message. Grepping line-by-line through microk8s inspect shows:

Apr 06 15:55:29 smokey01 microk8s.daemon-apiserver[353544]: I0406 15:55:29.905653  353544 server.go:630] external host was not specified, using 10.1.1.103
Apr 06 15:55:29 smokey01 microk8s.daemon-apiserver[353544]: W0406 15:55:29.905810  353544 authentication.go:519] AnonymousAuth is not allowed with the AlwaysAllow authorizer. Resetting 
AnonymousAuth to false. You should use a different authorizer
Apr 06 15:55:31 smokey01 microk8s.daemon-apiserver[353544]: Error: listen to 192.168.1.68:19001: listen tcp 192.168.1.68:19001: bind: cannot assign requested address
Apr 06 15:55:31 smokey01 systemd[1]: snap.microk8s.daemon-apiserver.service: Main process exited, code=exited, status=1/FAILURE
Apr 06 15:55:31 smokey01 systemd[1]: snap.microk8s.daemon-apiserver.service: Failed with result 'exit-code'.
Apr 06 15:55:32 smokey01 systemd[1]: snap.microk8s.daemon-apiserver.service: Scheduled restart job, restart counter is at 4306.
Apr 06 15:55:32 smokey01 systemd[1]: Stopped Service for snap application microk8s.daemon-apiserver.
Apr 06 15:55:32 smokey01 systemd[1]: Started Service for snap application microk8s.daemon-apiserver.

Looks like my IP address changed and now microk8s is hosed. I could just re-install the whole thing, but I'd really like to figure this out.

balchua commented 3 years ago

Based on the logs, port 19001 is used.

ddombrowsky commented 3 years ago

Based on the logs, port 19001 is used.

The problem is the IP address. The box no longer has an interface with 192.168.1.68 so the bind will fail. I couldn't figure out where this was set, so I gave up and ended up uninstalling and reinstalling microk8s with snap.

Sure would like to know how to actually fix that in a production environment, though. Seems rather important.
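
For what it's worth, a way to at least locate where the old address is recorded inside the snap's data (a sketch; the exact file depends on the datastore your MicroK8s version uses):

$ sudo grep -rl "192.168.1.68" /var/snap/microk8s/current/ 2>/dev/null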

balchua commented 3 years ago

Thanks for clarifying @ddombrowsky. It is a multi-node cluster, right? @ktsakalozos, @MathieuBordere does dqlite support a node changing its IP address while remaining a member of the cluster?

MathieuBordere commented 3 years ago

Thanks for clarifying @ddombrowsky. It is a multi-node cluster, right? @ktsakalozos, @MathieuBordere does dqlite support a node changing its IP address while remaining a member of the cluster?

The underlying raft implementation supports configuration changes. The way to do it, I think, is to remove the node from the cluster and add it again with its new address.

MatrixManAtYrService commented 3 years ago

On this advice I disabled swap and then stopped/started microk8s. After that it worked again.
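
For reference, the sequence was roughly:

$ sudo swapoff -a
$ microk8s stop
$ microk8s start
$ microk8s status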

romanchyla commented 3 years ago

Similarly to @ddombrowsky, two of my nodes were reporting not running, and one pod was stuck terminating after I unplugged the network interfaces. The boxes previously had multiple NICs, so it took a while to realize what was going on (the message 'microk8s is not running' was leading me astray).

Once I plugged the NICs back in, one of the nodes started behaving normally; on the other (which in the meantime had been reinstalled) I had to do:

microk8s leave, followed by microk8s.remove-node xxx on the master node (since microk8s.kubectl get no -o wide was still reporting this node as present but NotReady), and then microk8s.add-node plus microk8s.join .... on that 'sick' node.

After that, microk8s.status was good again.

I believe this procedure will work for others because, as I said, one of the nodes was reformatted in the meantime; it got a new OS.
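
In short, the sequence was roughly the following (node name and join string are placeholders; microk8s.add-node prints the exact join command to run):

# on the 'sick' node
$ microk8s leave
# on the master: remove the stale entry, then generate a join token
$ microk8s.remove-node <node-name>
$ microk8s.add-node
# back on the 'sick' node: run the join command printed by add-node
$ microk8s join <master-ip>:25000/<token>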

guoqiao commented 3 years ago

Solved it for me @gavinB-orange: I had to remove my previously installed kubectl, and then microk8s started working!

rm -rf /usr/local/bin/kubectl

sudo snap remove microk8s and rm -rf /usr/local/bin/kubectl fixed the issue for me. I guess this is because I was using minikube before; when switching to microk8s, I had to delete it. Can anyone else confirm this?

j-cunanan commented 3 years ago

Same error. It worked the first time but could not recognize my RTX A6000 GPU, so I redid the process here and got my GPU running on a fresh cluster.

But now microk8s is not working, and inspect gives:

Inspecting services
  Service snap.microk8s.daemon-cluster-agent is running
  Service snap.microk8s.daemon-flanneld is running
  Service snap.microk8s.daemon-containerd is running
  Service snap.microk8s.daemon-apiserver is running
  Service snap.microk8s.daemon-apiserver-kicker is running
  Service snap.microk8s.daemon-proxy is running
  Service snap.microk8s.daemon-kubelet is running
  Service snap.microk8s.daemon-scheduler is running
  Service snap.microk8s.daemon-controller-manager is running
  Service snap.microk8s.daemon-etcd is running
  Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
  Copy processes list to the final report tarball
  Copy snap list to the final report tarball
  Copy VM name (or none) to the final report tarball
  Copy disk usage information to the final report tarball
  Copy memory usage information to the final report tarball
  Copy server uptime to the final report tarball
  Copy current linux distribution to the final report tarball
  Copy openSSL information to the final report tarball
  Copy network configuration to the final report tarball
Inspecting kubernetes cluster
  Inspect kubernetes cluster

Building the report tarball
  Report tarball is at /var/snap/microk8s/1671/inspection-report-20210713_012322.tar.gz

inspection-report-20210713_012322.tar.gz

I've tried reinstalling older versions and running microk8s.reset, with and without sudo swapoff -a, but no luck.

Thanks in advance!

balchua commented 3 years ago

@j-cunanan kube-proxy isn't running correctly. You may already have a process running on port 10249.

As shown in this error:

 7月 13 01:23:17 user-desktop microk8s.daemon-proxy[335469]: E0713 01:23:17.225866  335469 server.go:564] starting metrics server failed: listen tcp 127.0.0.1:10249: bind: address already in use
j-cunanan commented 3 years ago

@balchua It seems my installation did not work because a k8s cluster was already running. Is this a known issue? Or are there additional steps needed when installing microk8s alongside an existing cluster?

I removed my previous k8s installation and now microk8s is running properly.

balchua commented 3 years ago

Yes, you will definitely run into port conflicts, whether you already have another Kubernetes running or an application using the ports. The list of ports and services is here: https://microk8s.io/docs/services-and-ports
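
A quick way to see what is already bound to one of those ports (10249 in this example) is something like:

$ sudo ss -lntp | grep 10249
# or
$ sudo lsof -i :10249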

j-cunanan commented 3 years ago

Thanks for your quick reply, @balchua!

I got KFP running and tried the default XGBoost sample, but my run does not show in the dashboard.

It was executed properly, though, according to microk8s kubectl get pods -A:

kubeflow                 train-until-good-pipeline-qskms-832503860                     0/2     Completed          0          82s
kubeflow                 train-until-good-pipeline-qskms-3307472060                    0/2     Completed          0          58s
kubeflow                 train-until-good-pipeline-qskms-831860599                     0/2     Completed          0          31s
kubeflow                 train-until-good-pipeline-qskms-2281756174                    0/2     Completed          0          58s
kubeflow                 train-until-good-pipeline-qskms-364220547                     2/2     Running            0          20s

My Runs dashboard is just empty.

balchua commented 3 years ago

@j-cunanan can you create a new issue for your Kubeflow problem? I don't know anything about that.

j-cunanan commented 3 years ago

Will do, thanks!

RaghuMeda commented 3 years ago

I had the same issue and figured out that it was due to the firewall. Open all the firewall ports required by microk8s, and it starts running as expected.
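
For example, on a host using ufw that would look something like this (a sketch, not an exhaustive list; see https://microk8s.io/docs/services-and-ports for the full set):

$ sudo ufw allow 16443/tcp    # API server
$ sudo ufw allow 10250/tcp    # kubelet
$ sudo ufw allow 25000/tcp    # cluster agent (node joins)
$ sudo ufw allow 19001/tcp    # datastore (dqlite)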

alexgleason commented 3 years ago

Hi, my server got rebooted and now my MicroK8s VMs no longer work. They were running social media sites and I have users who want to log back on. Could you please help me? I've tried restarting the server, running microk8s stop and microk8s start, and some of the suggestions in this thread. Nothing works.

There are two nodes running MicroK8s. Here's output from one of them:

tribes@tribes-doge:~$ microk8s inspect
Inspecting Certificates
Inspecting services
  Service snap.microk8s.daemon-cluster-agent is running
  Service snap.microk8s.daemon-containerd is running
 FAIL:  Service snap.microk8s.daemon-apiserver is not running
For more details look at: sudo journalctl -u snap.microk8s.daemon-apiserver
  Service snap.microk8s.daemon-apiserver-kicker is running
  Service snap.microk8s.daemon-control-plane-kicker is running
  Service snap.microk8s.daemon-proxy is running
  Service snap.microk8s.daemon-kubelet is running
  Service snap.microk8s.daemon-scheduler is running
  Service snap.microk8s.daemon-controller-manager is running
  Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
  Copy processes list to the final report tarball
  Copy snap list to the final report tarball
  Copy VM name (or none) to the final report tarball
  Copy disk usage information to the final report tarball
  Copy memory usage information to the final report tarball
  Copy server uptime to the final report tarball
  Copy current linux distribution to the final report tarball
  Copy openSSL information to the final report tarball
  Copy network configuration to the final report tarball
Inspecting kubernetes cluster
  Inspect kubernetes cluster
Inspecting juju
  Inspect Juju
Inspecting kubeflow
  Inspect Kubeflow

Building the report tarball
  Report tarball is at /var/snap/microk8s/2361/inspection-report-20210810_014245.tar.gz

inspection-report-20210810_014245.tar.gz

Any ideas?