siderolabs / talos

Talos Linux is a modern Linux distribution built for Kubernetes.
https://www.talos.dev
Mozilla Public License 2.0

Kube API server timeout on RPI install #9619

Open Ph4rell opened 1 week ago

Ph4rell commented 1 week ago

Bug Report

Description

I'm trying to run Talos on a Raspberry Pi 4B, but I'm running out of ideas. Everything looks good on the dashboard, except that I can't connect to the API server with kubectl or curl.

Any hints?

Pierre

Install config:

    install:
        disk: /dev/mmcblk0
        image: ghcr.io/siderolabs/installer:v1.7.6 
        wipe: false

Network config:

    network:
        interfaces:
          - interface: eth0
            addresses:
              - 192.168.1.101/24
            routes:
              - network: 0.0.0.0/0
                gateway: 192.168.1.1
            vip:
              ip: 192.168.1.100

        nameservers:
          - 192.168.1.50
          - 8.8.8.8

        extraHostEntries:
          - ip: 192.168.1.101
            aliases:
              - rpi-1
          - ip: 192.168.1.102
            aliases:
              - rpi-2
          - ip: 192.168.1.103
            aliases:
              - rpi-3
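As a quick sanity check on the addressing above, the following Python sketch (standard library only) verifies that the VIP and the gateway fall inside the subnet of the node address. The values are copied from the config in this report; the helper name `check_same_subnet` is made up for illustration and is not part of Talos.

```python
import ipaddress


def check_same_subnet(interface_cidr: str, *addresses: str) -> dict:
    """Return {address: True/False} for membership in the interface's subnet."""
    network = ipaddress.ip_interface(interface_cidr).network
    return {addr: ipaddress.ip_address(addr) in network for addr in addresses}


# Values taken from the network config in this report:
# node address 192.168.1.101/24, VIP 192.168.1.100, gateway 192.168.1.1.
result = check_same_subnet("192.168.1.101/24", "192.168.1.100", "192.168.1.1")
for addr, ok in result.items():
    print(f"{addr}: {'inside' if ok else 'OUTSIDE'} the interface subnet")
```

Both addresses are inside 192.168.1.0/24, so this part of the config looks consistent on its own.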

Logs

In the dashboard (screenshot, 2024-10-31 at 23:35:15):

warning: [2024-10-31T23:03:55.604968327Z]: [talos] kubernetes endpoint watch error {"component": "controller-runtime", "controller": "k8s.EndpointController", "error": "failed to list *v1.Endpoints: Get \"https://198.168.1.100:6443/api/v1/namespaces/default/endpoints?fieldSelector=metadata.name%3Dkubernetes&limit=500&resourceVersion=0\": dial tcp 198.168.1.100:6443: i/o timeout"}
talosctl health --talosconfig talosconfig --nodes 192.168.1.101                       took 26m43s
discovered nodes: ["192.168.1.100"]
waiting for etcd to be healthy: ...
waiting for etcd to be healthy: OK
waiting for etcd members to be consistent across nodes: ...
waiting for etcd members to be consistent across nodes: OK
waiting for etcd members to be control plane nodes: ...
waiting for etcd members to be control plane nodes: OK
waiting for apid to be ready: ...
waiting for apid to be ready: OK
waiting for all nodes memory sizes: ...
waiting for all nodes memory sizes: OK
waiting for all nodes disk sizes: ...
waiting for all nodes disk sizes: OK
waiting for kubelet to be healthy: ...
waiting for kubelet to be healthy: OK
waiting for all nodes to finish boot sequence: ...
waiting for all nodes to finish boot sequence: OK
waiting for all k8s nodes to report: ...
waiting for all k8s nodes to report: Get "https://198.168.1.100:6443/api/v1/nodes": context deadline exceeded
healthcheck error: rpc error: code = DeadlineExceeded desc = context deadline exceeded
curl https://192.168.1.100:6443/api -v -k
*   Trying 192.168.1.100:6443...
* Connected to 192.168.1.100 (192.168.1.100) port 6443
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
* (304) (IN), TLS handshake, Server hello (2):
* (304) (IN), TLS handshake, Unknown (8):
* (304) (IN), TLS handshake, Request CERT (13):
* (304) (IN), TLS handshake, Certificate (11):
* (304) (IN), TLS handshake, CERT verify (15):
* (304) (IN), TLS handshake, Finished (20):
* (304) (OUT), TLS handshake, Certificate (11):
* (304) (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256 / [blank] / UNDEF
* ALPN: server accepted h2
* Server certificate:
*  subject: O=kube-master; CN=kube-apiserver
*  start date: Oct 31 22:29:30 2024 GMT
*  expire date: Oct 31 22:29:30 2025 GMT
*  issuer: O=kubernetes
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://192.168.1.100:6443/api
* [HTTP/2] [1] [:method: GET]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: 192.168.1.100:6443]
* [HTTP/2] [1] [:path: /api]
* [HTTP/2] [1] [user-agent: curl/8.7.1]
* [HTTP/2] [1] [accept: */*]
> GET /api HTTP/2
> Host: 192.168.1.100:6443
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
* received GOAWAY, error=0, last_stream=1
< HTTP/2 401
< audit-id: 97e54f26-d8ab-43ca-99b8-9cff0c091719
< cache-control: no-cache, private
< content-type: application/json
< content-length: 157
< date: Thu, 31 Oct 2024 23:09:45 GMT
<
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "Unauthorized",
  "reason": "Unauthorized",
  "code": 401
}
* Closing connection
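One detail worth comparing in the output above: the failing requests (the endpoint watch error and the `talosctl health` step) go to `198.168.1.100`, whereas the VIP in the network config and the `curl` call use `192.168.1.100`. Whether that difference is the cause is not confirmed anywhere in this thread, but it may be worth checking the cluster endpoint in the machine config. A small Python sketch that pulls the hosts out of such log lines and flags any that differ from the expected VIP; the function name is hypothetical and the log excerpts are shortened from the ones above:

```python
import re

# Matches the host part of https://<ipv4>:<port> URLs.
URL_HOST = re.compile(r"https://(\d{1,3}(?:\.\d{1,3}){3}):\d+")


def unexpected_hosts(log_text: str, expected: str) -> set:
    """Return every IPv4 host found in https URLs that differs from `expected`."""
    return {host for host in URL_HOST.findall(log_text) if host != expected}


# Shortened excerpts of the log lines in this report.
log_excerpt = '''
failed to list *v1.Endpoints: Get "https://198.168.1.100:6443/api/v1/namespaces/default/endpoints
waiting for all k8s nodes to report: Get "https://198.168.1.100:6443/api/v1/nodes": context deadline exceeded
* [HTTP/2] [1] OPENED stream for https://192.168.1.100:6443/api
'''

print(unexpected_hosts(log_excerpt, expected="192.168.1.100"))  # {'198.168.1.100'}
```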

Environment

smira commented 1 week ago

Please attach talosctl support bundle.

Ph4rell commented 1 week ago

support.zip

smira commented 1 week ago

I looked all over the support bundle, but I don't see why you would get an i/o timeout. The IP address is assigned to the host, and kube-apiserver is up.
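To narrow down where a timeout like this comes from, it can help to probe the TCP port directly from the machine that runs `kubectl`. A minimal Python sketch, standard library only; `probe_tcp` is a hypothetical helper name, not a Talos or Kubernetes tool:

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a TCP connect and return 'open', 'timeout', or 'error: <reason>'."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer within `timeout` seconds (comparable to an i/o timeout).
        return "timeout"
    except OSError as exc:
        # Refused connections, unreachable networks, resolution failures, etc.
        return f"error: {exc}"
```

For example, the results of `probe_tcp("192.168.1.100", 6443)` and `probe_tcp("198.168.1.100", 6443)` can be compared, since both addresses appear in this report. An `open` result only shows that the port accepts connections; the 401 in the `curl` output likewise shows that the API server answered and rejected the unauthenticated request.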

Ph4rell commented 18 hours ago

Damn it :), thanks for having a look, @smira!