imixs / imixs-cloud

A Lightweight Kubernetes Environment
https://imixs.github.io/imixs-cloud/
GNU General Public License v3.0

Traefik not accessible through http://master-ip:8100 #50

Closed rafrasenberg closed 3 years ago

rafrasenberg commented 3 years ago

First of all, many thanks for this repo. It's very good!

I am trying to set this up with 3 Ubuntu 20.04 AWS EC2 instances. I added a security group and made sure HTTP and that port are accessible. I even added an Elastic IP. My Traefik pod is running, but I can't access it through http://master-ip:8100. When I add the ingress route, I still can't access it.

Any tips on what might be blocking this?

Because this is looking fine:

NAME       TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                                     AGE
kube-dns   ClusterIP      10.96.0.10      <none>          53/UDP,53/TCP,9153/TCP                      87m
traefik    LoadBalancer   10.97.208.207   18.158.83.122   80:30650/TCP,443:30866/TCP,8100:30026/TCP   9m3s

But 18.158.83.122:8100 doesn't give me the dashboard.
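
For reference, each entry in the PORT(S) column above pairs a service port with a node port (service:nodePort/protocol), so service port 8100 is paired with node port 30026. A minimal sketch for pulling that pairing out of the column; `node_port_for` is a hypothetical helper for illustration, not part of imixs-cloud or kubectl:

```shell
# Hypothetical helper: print the node port paired with a service port
# in a PORT(S) string such as the one shown by `kubectl get svc`.
node_port_for() {
  ports="$1"   # e.g. "80:30650/TCP,443:30866/TCP,8100:30026/TCP"
  want="$2"    # service port to look up, e.g. 8100
  # Split the entries onto separate lines, then split each on ":" or "/".
  # Entries without ":" (no node port assigned) are skipped.
  printf '%s\n' "$ports" | tr ',' '\n' |
    awk -F'[:/]' -v p="$want" 'index($0, ":") && $1 == p { print $2 }'
}

node_port_for "80:30650/TCP,443:30866/TCP,8100:30026/TCP" 8100   # prints 30026
```

If the node port answers on the node's address while port 8100 does not, that would suggest the service itself is fine and the open question is how the external IP is matched. This is an assumption to verify, not a diagnosis.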

rsoika commented 3 years ago

In the service definition there is the following entry:

spec:
  externalIPs:
  - {MASTER-NODE-IP}
  externalTrafficPolicy: Cluster

Replacing the {MASTER-NODE-IP} placeholder under 'externalIPs' with the IP address of your master node should allow you to access the Traefik UI. But maybe your issue is different. Is Traefik working for you in general (you can test the whoami example), and is it just the UI that is not working? Have you taken a look at the Traefik log during startup? Maybe it prints something interesting. You can also try removing 030-ingress.yaml for the first setup, because it is just a convenience feature.
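
The placeholder substitution can be sketched as follows. `set_master_ip` is a hypothetical helper and 10.0.0.5 is a made-up example address. On EC2 the Elastic IP is usually not configured on the instance's own network interface, so trying the node's private address here is an assumption worth testing, not a confirmed fix.

```shell
# Hypothetical helper: fill the {MASTER-NODE-IP} placeholder in a manifest
# read from stdin and write the result to stdout.
set_master_ip() {
  sed "s/{MASTER-NODE-IP}/$1/g"
}

# Example input mirrors the service fragment quoted in this thread;
# 10.0.0.5 is an illustrative address only.
printf 'spec:\n  externalIPs:\n  - {MASTER-NODE-IP}\n' | set_master_ip 10.0.0.5
```

The output could then be reviewed and piped to `kubectl apply -f -`.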

rafrasenberg commented 3 years ago

Thank you for the quick reply @rsoika! I did follow your docs and replaced that value. In the Traefik logs I get a "connection refused", but the firewall on my server is turned off completely, and in AWS I even opened all ports. It's very odd.

I do have to note one thing: the readme you provided gives the following command:

sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=[NODE_IP_ADDRESS]

But that results in this error:

[kubelet-check] Initial timeout of 40s passed.

        Unfortunately, an error has occurred:
                timed out waiting for the condition

        This error is likely caused by:
                - The kubelet is not running
                - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

        If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
                - 'systemctl status kubelet'
                - 'journalctl -xeu kubelet'

        Additionally, a control plane component may have crashed or exited when started by the container runtime.
        To troubleshoot, list all containers using your preferred container runtimes CLI.

        Here is one example how you may list all Kubernetes containers running in docker:
                - 'docker ps -a | grep kube | grep -v pause'
                Once you have found the failing container, you can inspect its logs with:
                - 'docker logs CONTAINERID'

error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

So you can only run it without --apiserver-advertise-address=[NODE_IP_ADDRESS]. I suppose this may be the reason why my setup isn't working.

So I thought maybe it's because I am running Ubuntu rather than Debian as per your example (modified the install script as well to work for Ubuntu).

However for a sanity check I just ran the setup on Debian 10 Buster, and I am getting the same error. Do you maybe know how that is possible? You can check this as well, just spin up a Debian 10 Buster EC2 instance and then run your install script. This executes perfectly fine. But if you then try to use sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=[NODE_IP_ADDRESS] with the public IP address of the AWS EC2 instance, it ends up with an error. Is this something AWS specific or did this command change in a newer version of Kubernetes?
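
One way to check whether this is AWS-specific: a likely explanation is that the advertise address has to be one the node actually owns, while on EC2 the public or Elastic IP is typically translated by the AWS network and never appears on the instance's interface. A minimal sketch of that check; `is_local_address` is a hypothetical helper and the private addresses below are made up:

```shell
# Hypothetical helper: succeed if address $1 appears in the
# space-separated list of local addresses $2.
is_local_address() {
  for a in $2; do
    [ "$a" = "$1" ] && return 0
  done
  return 1
}

# On the node itself the list could come from `hostname -I`:
#   is_local_address 18.158.83.122 "$(hostname -I)" || echo "not a local address"
# Example with made-up private addresses:
if is_local_address 18.158.83.122 "10.0.1.5 172.17.0.1"; then
  echo "local"
else
  echo "not local"
fi
```

If the public address is not in the list, passing the private address to --apiserver-advertise-address is the variant to try.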

rafrasenberg commented 3 years ago

Small update: I have gotten it to sort of work.

It works when using Ubuntu 20.04 on an AWS EC2 instance, with the master node assigned an AWS Elastic IP, and this init command:

sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address={private_master_ip} --apiserver-cert-extra-sans={private_master_ip},18.158.83.122

Based on this Stackoverflow answer

The dashboard is now available at:

http://18.158.83.122:32502/dashboard

As you can see, this is very weird. The port forwarding does not seem to work, because http://18.158.83.122:8100 is not available. I am using the configuration in the management folder of your repo and didn't change anything in it except the email address for ACME and the external IP.

When I add an ingress route for the dashboard, it is available at an address like this: http://domain.com:32502/dashboard. So it is only available when the node port is specified explicitly, because port 80 gives me connection refused.
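
To narrow this down, it may help to probe each port from a machine outside the cluster. A minimal sketch, assuming bash and the coreutils `timeout` command are available where it runs; `probe_port` is a hypothetical helper:

```shell
# Hypothetical helper: print "open" if a TCP connection to host $1, port $2
# succeeds within 3 seconds, otherwise "closed" (refused, filtered, or timed out).
probe_port() {
  # bash opens a TCP connection when redirecting to /dev/tcp/<host>/<port>;
  # host and port are passed to the inner shell as $0 and $1.
  if timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

# Usage, with the address and ports from this thread:
#   for p in 80 8100 32502; do echo "$p $(probe_port 18.158.83.122 "$p")"; done
```

If only the node port reports open, traffic is reaching the node, and the remaining question is why ports 80 and 8100 are not matched there.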

Any help would be appreciated!

rsoika commented 3 years ago

I think the AWS platform handles networking completely differently from self-hosted nodes. The --pod-network-cidr=10.244.0.0/16 option assumes that there is a private network in your cluster to be used. This is how the nodes communicate internally. Maybe this is the root cause of the "timed out waiting for the condition" error.

One question is whether you need to publish the Traefik dashboard via an ingress at all. I think another valid solution would be to access the dashboard directly from the node and port it is running on.

It also took me a lot of time to get the dashboard up and running in a way I was satisfied with. So to rule out downstream problems, you should first make sure that the core Traefik feature, the routing from outside to inside, is working. This is why I recommended first verifying that something like the 'whoami' service works with an ingress configuration. If this works, then it should be possible to get the dashboard up and running ;-)
