Readiness probe reports connection refused when installing Higress #1335

Open asd969704376 opened 2 days ago

asd969704376 commented 2 days ago

[root@master ~]# kubectl get pod --all-namespaces -o wide
NAMESPACE        NAME                                  READY   STATUS              RESTARTS      AGE     IP   NODE     NOMINATED NODE   READINESS GATES
higress-system   higress-console-7f7c8496c-fc4zs       0/1     Running             4 (46s ago)   4m34s        node1
higress-system   higress-controller-6dd78dd75f-cf654   0/2     Running             2 (33s ago)   4m34s        node1
higress-system   higress-gateway-767884599d-5m5rr      0/1     ContainerCreating   0             4m34s        node1
higress-system   higress-gateway-767884599d-8vg62      0/1     Pending             0             4m34s
kube-flannel     kube-flannel-ds-cdsq5                 1/1     Running             0             21h          master
kube-flannel     kube-flannel-ds-kcq6f                 1/1     Running             0             21h          node1
kube-system      coredns-84df5b7799-8ps6r              1/1     Running             0             21h          master
kube-system      coredns-84df5b7799-t95ff              1/1     Running             0             21h          master
kube-system      etcd-master                           1/1     Running             5             21h          master
kube-system      kube-apiserver-master                 1/1     Running             5             21h          master
kube-system      kube-controller-manager-master        1/1     Running             1             21h          master
kube-system      kube-proxy-bdfz5                      1/1     Running             0             21h          node1
kube-system      kube-proxy-lb9p9                      1/1     Running             0             21h          master
kube-system      kube-scheduler-master                 1/1     Running             10            21h          master

[root@master ~]# kubectl describe pod -n higress-system higress-controller-6dd78dd75f-cf654
Name:             higress-controller-6dd78dd75f-cf654
Namespace:        higress-system
Priority:         0
Service Account:  higress-controller
Node:             node1/
Start Time:       Tue, 24 Sep 2024 11:36:53 +0800
Labels:           app=higress-controller
                  higress=higress-controller
                  pod-template-hash=6dd78dd75f
Annotations:
Status:           Running
IP:
IPs:
  IP:
Controlled By:  ReplicaSet/higress-controller-6dd78dd75f
Containers:
  higress-core:
    Container ID:  containerd://24b52d175b8eac256fd291dbd493cf771c0d8f53c09714c749dc7277a0457316
    Image:
    Image ID:
    Ports:         8888/TCP, 8889/TCP, 15051/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Args:
      serve
      --gatewaySelectorKey=higress
      --gatewaySelectorValue=higress-system-higress-gateway
      --gatewayHttpPort=80
      --gatewayHttpsPort=443
      --ingressClass=higress
      --enableAutomaticHttps=true
      --automaticHttpsEmail=
    State:          Running
      Started:      Tue, 24 Sep 2024 11:41:06 +0800
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 24 Sep 2024 11:38:54 +0800
      Finished:     Tue, 24 Sep 2024 11:40:54 +0800
    Ready:          False
    Restart Count:  2
    Limits:
      cpu:     1
      memory:  2Gi
    Requests:
      cpu:     500m
      memory:  2Gi
    Readiness:  http-get http://:8888/ready delay=1s timeout=5s period=3s #success=1 #failure=3
    Environment:
      POD_NAME:         higress-controller-6dd78dd75f-cf654 (
      POD_NAMESPACE:    higress-system (v1:metadata.namespace)
      SERVICE_ACCOUNT:  (v1:spec.serviceAccountName)
      DOMAIN_SUFFIX:    cluster.local
    Mounts:
      /var/log from log (rw)
      /var/run/secrets/ from kube-api-access-6xm88 (ro)
  discovery:
    Container ID:  containerd://7540143771a5434e43f1ff78061519655bd267570e77d1aa0f774f8ee3dd3152
    Image:
    Image ID:
    Ports:         8080/TCP, 15010/TCP, 15017/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Args:
      discovery
      --monitoringAddr=:15014
      --log_output_level=default:info
      --domain
      cluster.local
      --keepaliveMaxServerConnectionAge
      30m
    State:          Running
      Started:      Tue, 24 Sep 2024 11:36:54 +0800
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:     500m
      memory:  2Gi
    Readiness:  http-get http://:8080/ready delay=1s timeout=5s period=3s #success=1 #failure=3
    Environment:
      PILOT_ENABLE_HEADLESS_SERVICE_POD_LISTENERS:  false
      HIGRESS_SYSTEM_NS:                            higress-system
      DEFAULT_UPSTREAM_CONCURRENCY_THRESHOLD:       10000
      ISTIO_GPRC_MAXRECVMSGSIZE:                    104857600
      ENBALE_SCOPED_RDS:                            true
      ON_DEMAND_RDS:                                false
      HOST_RDS_MERGE_SUBSET:                        false
      PILOT_FILTER_GATEWAY_CLUSTER_CONFIG:          true
      HIGRESS_CONTROLLER_SVC:
      HIGRESS_CONTROLLER_PORT:                      15051
      REVISION:                                     default
      JWT_POLICY:                                   third-party-jwt
      PILOT_CERT_PROVIDER:                          istiod
      POD_NAME:                                     higress-controller-6dd78dd75f-cf654 (
      POD_NAMESPACE:                                higress-system (v1:metadata.namespace)
      SERVICE_ACCOUNT:                              (v1:spec.serviceAccountName)
      KUBECONFIG:                                   /var/run/secrets/remote/config
      PRIORITIZED_LEADER_ELECTION:                  false
      INJECT_ENABLED:                               false
      PILOT_ENABLE_CROSS_CLUSTER_WORKLOAD_ENTRY:    false
      PILOT_ENABLE_METADATA_EXCHANGE:               false
      PILOT_SCOPE_GATEWAY_TO_NAMESPACE:             false
      VALIDATION_ENABLED:                           false
      PILOT_TRACE_SAMPLING:                         1
      PILOT_ENABLE_PROTOCOL_SNIFFING_FOR_OUTBOUND:  true
      PILOT_ENABLE_PROTOCOL_SNIFFING_FOR_INBOUND:   true
      ISTIOD_ADDR:                                  istiod.higress-system.svc:15012
      PILOT_ENABLE_ANALYSIS:                        false
      CLUSTER_ID:                                   Kubernetes
      CUSTOM_CA_CERT_NAME:                          higress-ca-root-cert
    Mounts:
      /etc/cacerts from cacerts (ro)
      /etc/istio/config from config (rw)
      /var/run/secrets/istio-dns from local-certs (rw)
      /var/run/secrets/ from kube-api-access-6xm88 (ro)
      /var/run/secrets/remote from istio-kubeconfig (ro)
      /var/run/secrets/tokens from istio-token (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True
  Initialized                 True
  Ready                       False
  ContainersReady             False
  PodScheduled                True
Volumes:
  log:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      higress-config
    Optional:  false
  local-certs:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:
  istio-token:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  43200
  cacerts:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cacerts
    Optional:    true
  istio-kubeconfig:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  istio-kubeconfig
    Optional:    true
  kube-api-access-6xm88:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:       Burstable
Node-Selectors:
Tolerations:     op=Exists for 300s
                 op=Exists for 300s
Events:
  Type     Reason     Age                    From               Message
  Normal   Scheduled  6m3s                   default-scheduler  Successfully assigned higress-system/higress-controller-6dd78dd75f-cf654 to node1
  Normal   Pulled     6m2s                   kubelet            Container image "" already present on machine
  Normal   Created    6m2s                   kubelet            Created container higress-core
  Normal   Started    6m2s                   kubelet            Started container higress-core
  Normal   Pulled     6m2s                   kubelet            Container image "" already present on machine
  Normal   Created    6m2s                   kubelet            Created container discovery
  Normal   Started    6m2s                   kubelet            Started container discovery
  Warning  Unhealthy  5m42s (x10 over 6m1s)  kubelet            Readiness probe failed: Get "": dial tcp connect: connection refused
  Warning  Unhealthy  60s (x113 over 6m1s)   kubelet            Readiness probe failed: Get "": dial tcp connect: connection refused

What is causing this? Is something wrong with how I set up my Kubernetes cluster?

johnlanni commented 2 days ago

Do you have the pod logs?

asd969704376 commented 1 day ago

johnlanni commented 1 day ago

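It looks like your problem is that the higress-controller pod cannot become ready, so check that pod's logs. The pod has two containers; look at the logs of both and see what they report.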
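One way to collect the logs of both containers is sketched below. The namespace and the container names (`higress-core`, `discovery`) are taken from the `kubectl describe` output above. The pod name is only a placeholder and changes whenever the pod is recreated, and the function name `dump_controller_logs` and the 200-line tail are arbitrary choices for illustration.

```shell
#!/bin/sh
# Namespace and container names come from the describe output in this issue.
NS=higress-system
# Placeholder: replace with the current pod name from `kubectl get pod -n higress-system`.
POD=${POD:-higress-controller-6dd78dd75f-cf654}

# Hypothetical helper: print current and previous logs for each container.
dump_controller_logs() {
    for c in higress-core discovery; do
        echo "===== $c (current) ====="
        kubectl logs -n "$NS" "$POD" -c "$c" --tail=200
        # A restarted container (higress-core exited with code 1) keeps the
        # log of its previous instance; this fails harmlessly if none exists.
        echo "===== $c (previous) ====="
        kubectl logs -n "$NS" "$POD" -c "$c" --previous --tail=200 || true
    done
}
```

After sourcing the snippet, run `dump_controller_logs`. Without `-c`, `kubectl logs` falls back to a default container, so the second container's output has to be requested explicitly.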

asd969704376 commented 1 day ago

[root@master ~]# kubectl logs -n higress-system higress-controller-6dd78dd75f-dbgvt
Defaulted container "higress-core" out of: higress-core, discovery
2024-09-25T03:42:00.580531Z info FLAG: --automaticHttpsEmail=""
2024-09-25T03:42:00.580572Z info FLAG: --certHttpAddress=":8889"
2024-09-25T03:42:00.580585Z info FLAG: --clusterAliases="[]"
2024-09-25T03:42:00.580592Z info FLAG: --clusterID="Kubernetes"
2024-09-25T03:42:00.580598Z info FLAG: --clusterRegistriesNamespace=""
2024-09-25T03:42:00.580605Z info FLAG: --debug="true"
2024-09-25T03:42:00.580611Z info FLAG: --domain="cluster.local"
2024-09-25T03:42:00.580624Z info FLAG: --enableAutomaticHttps="true"
2024-09-25T03:42:00.580630Z info FLAG: --enableStatus="true"
2024-09-25T03:42:00.580637Z info FLAG: --gatewayHttpPort="80"
2024-09-25T03:42:00.580643Z info FLAG: --gatewayHttpsPort="443"
2024-09-25T03:42:00.580649Z info FLAG: --gatewaySelectorKey="higress"
2024-09-25T03:42:00.580655Z info FLAG: --gatewaySelectorValue="higress-system-higress-gateway"
2024-09-25T03:42:00.580661Z info FLAG: --grpcAddress=":15051"
2024-09-25T03:42:00.580667Z info FLAG: --help="false"
2024-09-25T03:42:00.580672Z info FLAG: --httpAddress=":8888"
2024-09-25T03:42:00.580678Z info FLAG: --ingressClass="higress"
2024-09-25T03:42:00.580684Z info FLAG: --keepStaleWhenEmpty="false"
2024-09-25T03:42:00.580691Z info FLAG: --keepaliveInterval="30s"
2024-09-25T03:42:00.580698Z info FLAG: --keepaliveMaxServerConnectionAge="2562047h47m16.854775807s"
2024-09-25T03:42:00.580704Z info FLAG: --keepaliveTimeout="10s"
2024-09-25T03:42:00.580709Z info FLAG: --kubeconfig=""
2024-09-25T03:42:00.580716Z info FLAG: --kubernetesApiBurst="160"
2024-09-25T03:42:00.580725Z info FLAG: --kubernetesApiQPS="80"
2024-09-25T03:42:00.580730Z info FLAG: --log_as_json="false"
2024-09-25T03:42:00.580736Z info FLAG: --log_caller=""
2024-09-25T03:42:00.580741Z info FLAG: --log_output_level="default:info"
2024-09-25T03:42:00.580747Z info FLAG: --log_rotate=""
2024-09-25T03:42:00.580752Z info FLAG: --log_rotate_max_age="30"
2024-09-25T03:42:00.580758Z info FLAG: --log_rotate_max_backups="1000"
2024-09-25T03:42:00.580770Z info FLAG: --log_rotate_max_size="104857600"
2024-09-25T03:42:00.580776Z info FLAG: --log_stacktrace_level="default:none"
2024-09-25T03:42:00.580785Z info FLAG: --log_target="[stdout]"
2024-09-25T03:42:00.580791Z info FLAG: --resync="1m0s"
2024-09-25T03:42:00.580798Z info FLAG: --vklog="0"
2024-09-25T03:42:00.580803Z info FLAG: --watchNamespace=""
2024-09-25T03:42:30.584425Z info init xds server

I printed the logs and this is all there is.