aibangjuxin / knowledge

GKE ingress #136

Open aibangjuxin opened 6 months ago

aibangjuxin commented 6 months ago

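Installing an internal Ingress in a GCP project to expose a GKE load balancer

The following steps describe how to set up an internal Ingress that exposes a GKE service through a load balancer.

I. Prerequisites

  1. Make sure you have a GCP project and a GKE cluster.
  2. Install the kubectl command-line tool.
  3. Make sure you have a kubeconfig entry for the cluster.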
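
A minimal sketch of the kubeconfig step, assuming the gcloud CLI is installed and authenticated. The function name is made up for illustration, and the cluster name, region, and project ID are placeholders.

```shell
# Hypothetical helper: fetch cluster credentials into the local kubeconfig,
# then confirm that kubectl can reach the cluster.
# Arguments (all placeholders): cluster name, region, project ID.
fetch_gke_credentials() {
  cluster="$1"; region="$2"; project="$3"
  gcloud container clusters get-credentials "$cluster" \
    --region "$region" --project "$project" || return 1
  # Show which context kubectl now uses, then list the nodes as a smoke test.
  kubectl config current-context
  kubectl get nodes
}

# Usage (placeholder values):
#   fetch_gke_credentials my-cluster europe-west1 my-project
```

For a zonal cluster, `--zone` would replace `--region`.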

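II. Install an internal Ingress controller

  1. Choose an Ingress controller that can serve internal traffic, for example NGINX Ingress Controller or Istio Ingress Gateway. GKE also ships a built-in Ingress controller; it provisions an internal Application Load Balancer when the Ingress carries the annotation kubernetes.io/ingress.class: "gce-internal", in which case nothing needs to be installed.
  2. If you chose a third-party controller, install it by following its official documentation.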

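III. Create the Ingress resource

  1. Create an Ingress resource that defines the host names and the GKE Services you want to expose.
  2. In the Ingress resource, specify the following:
    • spec.rules: the host and path rules. In networking.k8s.io/v1, each path names its target Service under spec.rules[].http.paths[].backend.
    • spec.defaultBackend: the Service that receives requests matching no rule (this field was called spec.backend in the older v1beta1 API).
    • metadata.annotations: any required annotations, for example kubernetes.io/ingress.class to select the Ingress controller.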

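IV. Create the Service resource

  1. Create a Service resource for the workload you want to expose.
  2. In the Service resource, specify the following:
    • spec.type: the Service type, for example LoadBalancer or ClusterIP.
    • spec.selector: the label selector for the Pods to expose.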
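
As an illustration of the two fields above, here is a sketch that writes a Service manifest to a local file. The names my-service and my-app and the container port 8080 are assumptions. The cloud.google.com/neg annotation only matters when GKE's built-in Ingress controller is used, where it requests container-native load balancing.

```shell
# Write a sketch Service manifest to a local file (all names are placeholders).
cat > my-service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    # Ask GKE to create network endpoint groups for Ingress backends.
    cloud.google.com/neg: '{"ingress": true}'
spec:
  type: ClusterIP
  selector:
    app: my-app          # must match the Pod labels
  ports:
  - port: 80             # port referenced by the Ingress
    targetPort: 8080     # port the container listens on (assumed)
EOF

# Apply it once kubectl points at the right cluster:
#   kubectl apply -f my-service.yaml
```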

V. Test the Ingress

  1. Run kubectl get ingress to check the status of the Ingress resource.
  2. Use curl to send a test request through the Ingress.
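
The two test steps could be combined roughly as follows. This is a sketch: the function name is hypothetical, and the Ingress name and host are placeholders. For an internal Ingress the curl request has to come from a machine inside the same VPC.

```shell
# Hypothetical helper: print the Ingress status and send one request to it.
# Arguments (placeholders): Ingress name, host name configured in spec.rules.
test_ingress() {
  name="$1"; host="$2"
  kubectl get ingress "$name"
  # Read the address that the load balancer published on the Ingress.
  ip=$(kubectl get ingress "$name" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  if [ -z "$ip" ]; then
    echo "Ingress $name has no address yet" >&2
    return 1
  fi
  # Send the Host header explicitly so the test works before DNS is set up.
  curl -s -o /dev/null -w '%{http_code}\n' -H "Host: $host" "http://$ip/"
}

# Usage (placeholder values):
#   test_ingress my-ingress my-domain.com
```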

Operation flow

  1. Install the internal Ingress controller.
  2. Create the Ingress resource.
  3. Create the Service resource.
  4. Test the Ingress.

Example

The following example Ingress resource exposes a GKE Service named my-service:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  # Annotations belong under metadata, not under spec.
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - host: my-domain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-service
            port:
              number: 80

Adjust this example to match your environment.
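
The example uses the NGINX controller class. If you rely on GKE's built-in controller for an internal Application Load Balancer instead, the manifest might look like the sketch below. The resource names are placeholders; it also assumes the backend Service uses network endpoint groups and that the region already has a proxy-only subnet.

```shell
# Write a sketch manifest for an internal GKE Ingress (names are placeholders).
cat > my-internal-ingress.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-internal-ingress
  annotations:
    # Selects GKE's controller for internal Application Load Balancers.
    kubernetes.io/ingress.class: "gce-internal"
spec:
  rules:
  - host: my-domain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-service
            port:
              number: 80
EOF

# Apply it once kubectl points at the right cluster:
#   kubectl apply -f my-internal-ingress.yaml
```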

Here is a step-by-step guide to installing an internal ingress in a GCP project to expose a load balancer in a GKE cluster:

Step 1: Create a GKE cluster

Step 2: Create a load balancer in GKE

Step 3: Create an internal ingress resource

Step 4: Create a firewall rule to allow traffic to the internal ingress

Step 5: Configure the internal ingress to use the load balancer

Step 6: Test the internal ingress

Operation Flow:

Here is a high-level overview of the operation flow:

  1. Create a GKE cluster and deploy a load balancer service
  2. Create an internal ingress resource to expose the load balancer service
  3. Create a firewall rule to allow traffic to the internal ingress
  4. Configure the internal ingress to use the load balancer service
  5. Test the internal ingress to verify that it is working correctly

Note: This is a high-level guide and you may need to modify the steps and configurations based on your specific use case and requirements.
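
One prerequisite these steps do not spell out: an internal Application Load Balancer created by GKE Ingress needs a proxy-only subnet in the cluster's region. The sketch below shows how that subnet might be created; the function name is hypothetical and every argument is a placeholder.

```shell
# Hypothetical helper: create the proxy-only subnet used by the load
# balancer's proxies. Arguments (placeholders): subnet name, VPC network,
# region, and an unused CIDR range.
create_proxy_only_subnet() {
  name="$1"; network="$2"; region="$3"; range="$4"
  gcloud compute networks subnets create "$name" \
    --purpose=REGIONAL_MANAGED_PROXY \
    --role=ACTIVE \
    --region="$region" \
    --network="$network" \
    --range="$range"
}

# Usage (placeholder values):
#   create_proxy_only_subnet proxy-only-subnet my-vpc europe-west1 10.129.0.0/23
```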

aibangjuxin commented 6 months ago

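To troubleshoot a GKE Ingress whose backend service is reported as unhealthy and whose health check returns 404, work through the following steps:

  1. Check the Ingress configuration

    • Make sure the Ingress resource is configured correctly and names the right backend Service and port.
  2. Check the Service and its ports

    • Confirm that the backend Service exists and that its selector matches the Pod labels.
    • Confirm that the Service's targetPort matches the port your application listens on.
  3. Check Pod status

    • Confirm that the Pods are in the Running state and not in Error or CrashLoopBackOff.
  4. Check Pod logs

    • Read the Pod logs for startup or runtime errors.
  5. Check the health check configuration

    • If you configured a health check, make sure its path and port are correct.
  6. Check the Ingress access logs

    • With access logging enabled for the Ingress, you can see request details such as the request path and status code.
  7. Check network settings

    • Confirm that nothing in the network path blocks the traffic, such as firewall rules or network policies.
  8. Verify service discovery

    • Make sure service discovery (for example DNS) resolves your Service name correctly.
  9. Check the Istio configuration (if used)

    • If you use Istio as a service mesh, check its routing rules and VirtualService configuration.
  10. Test access from inside the cluster

    • Try to reach the backend Service directly from a Pod in the cluster and see whether the connection succeeds.
  11. Check resource quotas and limits

    • Confirm that resource quotas and limits are not preventing the Service from running.
  12. Consult the official GKE and Ingress documentation

    • The official GKE and Ingress documentation has more information on configuration and troubleshooting.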
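
Steps 2 to 4 can be run in one pass with kubectl. The helper below is a sketch; its name is hypothetical, and the namespace, Service name, and label selector are placeholders.

```shell
# Hypothetical helper: collect the basic facts for one backend Service.
# Arguments (placeholders): namespace, Service name, Pod label selector.
triage_backend() {
  ns="$1"; svc="$2"; selector="$3"
  # Pod status: look for anything that is not Running and Ready.
  kubectl -n "$ns" get pods -l "$selector" -o wide
  # Service definition: compare selector, port, and targetPort.
  kubectl -n "$ns" describe service "$svc"
  # An empty ENDPOINTS column means the selector matches no ready Pod.
  kubectl -n "$ns" get endpoints "$svc"
  # Recent application logs from the matching Pods.
  kubectl -n "$ns" logs -l "$selector" --tail=50
}

# Usage (placeholder values):
#   triage_backend default my-service app=my-app
```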

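If none of these steps resolves the problem, review the GKE logs and monitoring data, or contact Google Cloud support.

The link you provided could not be resolved, so I cannot comment on its content; the steps above are general troubleshooting advice. For more specific help, check that the link is correct or provide more context.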

aibangjuxin commented 6 months ago

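To troubleshoot a GKE Ingress whose backend service is shown as unhealthy and whose health check returns a 404 status code, work through the following steps:

  1. Check the service response: make sure your service answers health check requests correctly. By default, the GKE Ingress health check sends a GET request to the / path of the service and expects an HTTP 200 status code. If your service does not respond on / or returns anything other than 200, the health check fails. Either change the service so that it returns 200 on /, or configure a readinessProbe that specifies the health check path and expected response [3][4].

  2. Configure a readiness probe: if your service does not answer health checks on /, configure a readiness probe with a different path. GKE Ingress derives its health check from the readiness probe, which lets you customize the health check path. For example, you can set the readiness probe to expect an HTTP 200 response on /health [4][5].

  3. Check firewall rules: make sure the firewall rules in your Google Cloud project allow health check traffic from the Google Cloud load balancer. If a firewall rule blocks this traffic, the health check cannot reach your service and the service is marked unhealthy. You may need to add or modify firewall rules manually to allow traffic from the Google Cloud load balancer IP ranges [5].

  4. Check the Ingress and Service configuration: the Ingress must point to your Service, and the Service must select the right Pods. With a configuration error, the Ingress may be unable to route traffic to your service, which makes the health check fail [1][2][3].

  5. Review logs and events: check the logs and events of the relevant Pods, Services, and Ingress for errors or warnings. They can help you identify configuration problems or other causes of the failing health check [2][6][7].

  6. Use the GCP Console or the gcloud CLI: both can show the state of the Ingress and the backend service as well as the specific reason the health check fails. This gives more detailed diagnostics and helps you locate the problem faster [6][7].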
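
To make step 2 concrete, here is a sketch of a Deployment whose readiness probe answers on /health. The image, names, and port are placeholders. As far as I know, GKE only derives the health check from an httpGet probe on the serving port, and only if the Pods already exist when the Ingress is created, so treat this as a starting point.

```shell
# Write a sketch Deployment with an HTTP readiness probe (placeholder values).
cat > my-app-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: registry.example/my-app:1.0   # placeholder image
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /health    # must return HTTP 200 when the app is ready
            port: 8080
          periodSeconds: 10
EOF

# Apply it once kubectl points at the right cluster:
#   kubectl apply -f my-app-deployment.yaml
```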

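These steps should let you diagnose and fix a GKE Ingress whose backend service is unhealthy and whose health check returns 404. If the problem persists, consult the official Google Cloud documentation or ask in the relevant community forums.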

Sources
[1] Ingress for internal Application Load Balancers | Google Kubernetes Engine (GKE) | Google Cloud https://cloud.google.com/kubernetes-engine/docs/concepts/ingress-ilb
[2] GKE ingress stuck in creating after deploying ingress 1.11.5.gke.5 https://github.com/kubernetes/ingress-gce/issues/605
[3] Health check endpoint for GKE ingress · Issue #21 · fluxcd/flux-recv https://github.com/fluxcd/flux-recv/issues/21
[4] readinessProbe w/http-admin does not create valid heath check on ... https://github.com/ory/k8s/issues/113
[5] Ingress on GKE creates a HTTP Loadbalancer, but instances are ... https://github.com/kubernetes/kubernetes/issues/20555
[6] Troubleshooting Tasks for Ingress-GCE in GKE : r/kubernetes - Reddit https://www.reddit.com/r/kubernetes/comments/17g1hp3/troubleshooting_tasks_for_ingressgce_in_gke/
[7] GKE Ingress: "All backend services are in UNHEALTHY state" https://groups.google.com/g/google-cloud-dev/c/tIXrbsPoAFw
[8] kubernetes unhealthy ingress backend - Stack Overflow https://stackoverflow.com/questions/39294305/kubernetes-unhealthy-ingress-backend
[9] GKE ingress unable to connect to healthy service - Server Fault https://serverfault.com/questions/921954/gke-ingress-unable-to-connect-to-healthy-service
[10] GKE Ingress for Application Load Balancers - Google Cloud https://cloud.google.com/kubernetes-engine/docs/concepts/ingress
[11] unable to create https backend and https healthcheck on gke https://www.googlecloudcommunity.com/gc/Google-Kubernetes-Engine-GKE/unable-to-create-https-backend-and-https-healthcheck-on-gke/m-p/610176
[12] 404s for service endpoints when using ingress with Gcloud https://serverfault.com/questions/924278/404s-for-service-endpoints-when-using-ingress-with-gcloud
[13] Troubleshoot load balancing in GKE - Google Cloud https://cloud.google.com/kubernetes-engine/docs/troubleshooting/load-balancing
[14] Why GKE Ingress controller gives 404 error - Stack Overflow https://stackoverflow.com/questions/62296942/why-gke-ingress-controller-gives-404-error
[15] Ingress configuration on Google Cloud | Google Kubernetes Engine ... https://cloud.google.com/kubernetes-engine/docs/how-to/ingress-configuration
[16] GKE ingress resources are not working. getting 404 error https://stackoverflow.com/questions/63436938/gke-ingress-resources-are-not-working-getting-404-error
[17] Ingress paths not working with dynamic endpoint - Google Groups https://groups.google.com/g/kubernetes-users/c/Phy3R2V-efw
[18] Troubleshooting | Google Kubernetes Engine (GKE) https://cloud.google.com/kubernetes-engine/docs/troubleshooting
[19] Troubleshooting Deployments on GKE - Kubeflow https://v0-3.kubeflow.org/docs/guides/gke/troubleshooting-gke/
[20] All backend services are in UNHEALTHY in gke ingress https://www.googlecloudcommunity.com/gc/Serverless/All-backend-services-are-in-UNHEALTHY-in-gke-ingress/m-p/709716

aibangjuxin commented 6 months ago

Troubleshooting unhealthy backend services in GKE Ingress involves a combination of understanding the underlying architecture, configuration, and potential issues. Here's a step-by-step approach to help you identify and resolve the problem:

  1. Check the backend service configuration:

    • Verify that the backend service is correctly configured and running. Ensure that the service is exposed and accessible from the Ingress.
    • Check the service's readiness and liveness probes to ensure they are correctly configured and functioning as expected[1][3][4].
  2. Inspect the Ingress configuration:

    • Review the Ingress rules and confirm that each path points to the intended backend Service name and port.
    • Check the Ingress's health checks to ensure they are correctly configured and functioning as expected[1][3][4].
  3. Check the Load Balancer configuration:

    • Verify that the load balancer was provisioned for the Ingress and that its backend service routes traffic to the intended Service.
    • Check the Load Balancer's health checks to ensure they are correctly configured and functioning as expected[1][3][4].
  4. Check the firewall rules:

    • Ensure that the firewall rules allow traffic from the Load Balancer to the backend service. If the rules are blocking traffic, the backend service will be marked as unhealthy[1][5].
  5. Check the GKE cluster and node configuration:

    • Verify that the GKE cluster and nodes are correctly configured and functioning as expected. Ensure that the nodes are correctly configured to run the backend service.
    • Check the cluster's and node's logs for any errors or issues that might be related to the backend service's health[1][2][3][4].
  6. Check the Ingress and backend service logs:

    • Review the logs for the Ingress and backend service to identify any errors or issues that might be related to the backend service's health.
    • Check the logs for any errors or issues related to the health checks, readiness probes, and liveness probes[1][2][3][4].
  7. Check the GCP documentation and community resources:

    • Review the official GCP documentation and community resources for any known issues or limitations related to GKE Ingress and backend services.
    • Check the GitHub issues and Stack Overflow for any similar issues or solutions that might be applicable to your situation[1][2][3][4][5][6][7].

By following these steps, you should be able to identify and resolve the issue causing the backend service to be marked as unhealthy in your GKE Ingress.

Sources
[1] Ingress on GKE creates a HTTP Loadbalancer, but instances are ... https://github.com/kubernetes/kubernetes/issues/20555
[2] Unable to avoid unhealthy backend / 502s on rolling deployments https://github.com/kubernetes/ingress-gce/issues/1718
[3] [GKE] Ingress does not connect to NodePort Service #45438 - GitHub https://github.com/kubernetes/kubernetes/issues/45438
[4] GKE Ingress shows unhealthy backend services - Stack Overflow https://stackoverflow.com/questions/63268552/gke-ingress-shows-unhealthy-backend-services
[5] GKE Ingress: "All backend services are in UNHEALTHY state" https://groups.google.com/g/google-cloud-dev/c/tIXrbsPoAFw
[6] GKE Ingress: How to fix a 502 bad gateway error - Willian Antunes https://www.willianantunes.com/blog/2021/05/gke-ingress-how-to-fix-a-502-bad-gateway-error/
[7] GKE Ingress health check failed on ingress but succeed on ... https://stackoverflow.com/questions/71500696/gke-ingress-health-check-failed-on-ingress-but-succeed-on-loadbalncer
[8] GKE ingress unable to connect to healthy service - Server Fault https://serverfault.com/questions/921954/gke-ingress-unable-to-connect-to-healthy-service
[9] Streamlit 404 error in Google Cloud Run · Issue #484 - GitHub https://github.com/streamlit/streamlit/issues/484
[10] GCE Health Check settings are reverting to default - Server Fault https://serverfault.com/questions/916748/gce-health-check-settings-are-reverting-to-default
[11] NGINX Ingress controller troubleshooting - Container Service for ... https://www.alibabacloud.com/help/en/ack/ack-managed-and-ack-dedicated/user-guide/nginx-ingress-controller-troubleshooting
[12] Troubleshoot load balancing in GKE - Google Cloud https://cloud.google.com/kubernetes-engine/docs/troubleshooting/load-balancing
[13] How to troubleshoot unhealthy backends in Google Cloud Load ... https://www.youtube.com/watch?v=f-VIZ8ALVD4
[14] Ingress paths not working with dynamic endpoint - Google Groups https://groups.google.com/g/kubernetes-users/c/Phy3R2V-efw
[15] All backend services are in UNHEALTHY in gke ingress https://www.googlecloudcommunity.com/gc/Serverless/All-backend-services-are-in-UNHEALTHY-in-gke-ingress/m-p/709716
[16] Struggling to figure out why the ingress health check is failing https://www.googlecloudcommunity.com/gc/Google-Kubernetes-Engine-GKE/Struggling-to-figure-out-why-the-ingress-health-check-is-failing/td-p/620669/jump-to/first-unread-message

aibangjuxin commented 6 months ago

When you set a host in an Ingress, that host name must resolve inside the project. A DNS record has to map the name to the IP address of the Ingress, that is, of its load balancer; otherwise requests to the host name fail.

For example, if the Ingress uses hello-ingress as its host, add a DNS record in the project that maps hello-ingress to the IP address of the Ingress or load balancer.

With GKE Ingress, GKE provisions the load balancer and assigns it an IP address, and the load balancer routes each request to the matching backend based on the host name. GKE Ingress does not create the DNS record for you; you still need to add it yourself, for example in Cloud DNS.
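
As a sketch of the DNS step, assuming the zone is managed in Cloud DNS: the helper below reads the address from the Ingress status and creates an A record for it. The function name is hypothetical, and the Ingress name, zone name, and record name are placeholders. For an internal Ingress the zone would typically be a private zone.

```shell
# Hypothetical helper: publish the Ingress address as an A record.
# Arguments (placeholders): Ingress name, Cloud DNS managed zone name,
# fully qualified record name (with trailing dot).
publish_ingress_dns() {
  ingress="$1"; zone="$2"; fqdn="$3"
  ip=$(kubectl get ingress "$ingress" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  if [ -z "$ip" ]; then
    echo "Ingress $ingress has no address yet" >&2
    return 1
  fi
  gcloud dns record-sets create "$fqdn" \
    --zone="$zone" --type=A --ttl=300 --rrdatas="$ip"
}

# Usage (placeholder values):
#   publish_ingress_dns hello-ingress my-private-zone hello-ingress.example.internal.
```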


aibangjuxin commented 6 months ago

You've set up a GKE Ingress with an internal load balancer (ILB) following the documentation, but your backend services are reported as unhealthy and the health check returns a 404 error. To troubleshoot this issue, let's break it down step by step:

1. Verify the health check configuration

Check the health check configuration for your backend services. Make sure the health check path, port, and protocol are correctly set up. You can do this by running the following command:

gcloud compute health-checks describe <health-check-name>

Replace <health-check-name> with the name of your health check. Verify that the health check is configured to check the correct path, port, and protocol.
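
If the path or port turns out to be wrong, one way to set them explicitly on GKE is a BackendConfig attached to the Service. The sketch below assumes the application answers with HTTP 200 on /health at container port 8080; all resource names are placeholders.

```shell
# Write a sketch BackendConfig plus the Service that references it.
cat > my-backendconfig.yaml <<'EOF'
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: my-backendconfig
spec:
  healthCheck:
    type: HTTP
    requestPath: /health   # path that returns 200 (assumed)
    port: 8080             # container port when NEGs are used (assumed)
    checkIntervalSec: 15
    timeoutSec: 5
---
apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
    # Attach the BackendConfig to every port of this Service.
    cloud.google.com/backend-config: '{"default": "my-backendconfig"}'
spec:
  type: ClusterIP
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
EOF

# Apply it once kubectl points at the right cluster:
#   kubectl apply -f my-backendconfig.yaml
```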

2. Check the backend service configuration

Verify that the backend service is correctly configured and points to the correct instance group or network endpoint group. Run the following command:

gcloud compute backend-services describe <backend-service-name>

Replace <backend-service-name> with the name of your backend service. Check that the backend service is pointing to the correct instance group or network endpoint group.
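
For an internal load balancer the backend service is a regional resource, so the describe command will probably need a --region flag. The sketch below also calls get-health, which reports the health state of each backend. The function name is hypothetical and both arguments are placeholders.

```shell
# Hypothetical helper: show a regional backend service and its health.
# Arguments (placeholders): backend service name, region.
show_backend_health() {
  backend="$1"; region="$2"
  gcloud compute backend-services describe "$backend" --region="$region"
  # Reports HEALTHY or UNHEALTHY for each instance or endpoint.
  gcloud compute backend-services get-health "$backend" --region="$region"
}

# Usage (placeholder values; list names first with
# "gcloud compute backend-services list"):
#   show_backend_health my-backend-service europe-west1
```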

3. Check the instance group or network endpoint group configuration

Verify that the instance group or network endpoint group is correctly configured and has the correct instances or endpoints. Run the following command:

gcloud compute instance-groups describe <instance-group-name>

or

gcloud compute network-endpoint-groups describe <network-endpoint-group-name>

Replace <instance-group-name> or <network-endpoint-group-name> with the name of your instance group or network endpoint group. Check that the instances or endpoints are correctly configured and running.

4. Check the firewall rules

Verify that the firewall rules are correctly configured to allow traffic from the health check service to reach your backend instances. Run the following command:

gcloud compute firewall-rules list

Check that there are no firewall rules blocking traffic from the health check service to your backend instances.
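
If no rule allows the health check probes, one could be added along these lines. This is a sketch: the function name is hypothetical, and the rule name, network, and port are placeholders. The two source ranges are the ones Google documents for load balancer health checks.

```shell
# Hypothetical helper: allow Google Cloud health check probes to reach
# the backends. Arguments (placeholders): rule name, VPC network, TCP port.
allow_health_checks() {
  rule="$1"; network="$2"; port="$3"
  gcloud compute firewall-rules create "$rule" \
    --network="$network" \
    --direction=INGRESS \
    --action=ALLOW \
    --rules="tcp:$port" \
    --source-ranges=130.211.0.0/22,35.191.0.0/16
}

# Usage (placeholder values):
#   allow_health_checks allow-lb-health-checks my-vpc 8080
```

In practice you would likely narrow the rule to the cluster nodes with --target-tags. For an internal Application Load Balancer, traffic from the proxy-only subnet range also has to be allowed.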

5. Check the logs

Check the logs for your backend instances to see if there are any errors or issues that might be causing the health check to fail. The command below prints a node's serial console output; for the application's own logs, kubectl logs <pod-name> is usually more direct:

gcloud compute instances get-serial-port-output <instance-name> --zone <zone>

Replace <instance-name> with the name of your backend instance, and <zone> with the zone where the instance is running.

6. Test the health check manually

Try testing the health check manually by sending a request to the health check path using a tool like curl or wget. This can help you determine if the issue is with the health check configuration or with the backend instance.
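
A manual probe with curl might look like the sketch below. The function name is hypothetical and the URL is a placeholder. It treats only HTTP 200 as healthy, which matches the default expectation of the load balancer health check.

```shell
# Hypothetical helper: request a health check URL and report the status code.
# Argument (placeholder): full URL of the health check path.
probe_health_path() {
  url="$1"
  # -w prints only the HTTP status code; the response body is discarded.
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")
  if [ "$code" = "200" ]; then
    echo "healthy ($code)"
  else
    echo "unhealthy ($code)"
    return 1
  fi
}

# Usage (placeholder values), for example after
# "kubectl port-forward pod/<pod-name> 8080:8080":
#   probe_health_path http://127.0.0.1:8080/health
```

A 404 here points at the path configured in the health check; a timeout or connection error points at the port, the Pod, or the network.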

If none of these steps help you identify the issue, please provide more details about your setup, including the output of the commands above, and I'll do my best to help you troubleshoot the problem! 😊

aibangjuxin commented 6 months ago

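Here are some suggestions for troubleshooting a GKE Ingress whose backend service is unhealthy and whose health check returns 404:

  1. Check the Ingress resource

    • Make sure the backend Service port and the paths in the Ingress resource are correct.
    • Check that the Ingress annotations are correct, for example kubernetes.io/ingress.allow-http: "false".
  2. Check the backend workload

    • Make sure the readinessProbe and livenessProbe of the backend workload (Deployment, StatefulSet, and so on) are configured correctly.
    • Read the backend Pod logs for error messages.
    • Make sure the port the application listens on matches the Service's targetPort.
  3. Verify the load balancer configuration

    • Inspect the Endpoints resource of the backend Service and make sure it lists the right Pod IPs and ports.
    • Try to reach the backend Service from another Pod to test connectivity.
  4. Check firewall rules and network policies

    • Check the cluster's network policies for rules that block health check traffic.
    • Check the GCP firewall rules and make sure health check traffic is allowed.
  5. Recreate the affected resources

    • Try deleting and recreating the Ingress resource.
    • Recreate the backend workload (Deployment and so on) to force the Pods to be rescheduled.
  6. Review GKE Ingress logs and metadata

    • Read the Ingress controller logs in Cloud Logging (formerly Stackdriver).
    • Inspect the Ingress status, events, and annotations for more information.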
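
For step 6, much of the metadata is available through kubectl. The sketch below assumes the GKE controller records backend health in the ingress.kubernetes.io/backends annotation, which is what I have seen on GKE-managed Ingresses. The function name is hypothetical and the arguments are placeholders.

```shell
# Hypothetical helper: show Ingress status, backend health, and events.
# Arguments (placeholders): namespace, Ingress name.
show_ingress_backends() {
  ns="$1"; name="$2"
  kubectl -n "$ns" describe ingress "$name"
  # Backend health as recorded by the controller (dots in the key are escaped).
  kubectl -n "$ns" get ingress "$name" \
    -o jsonpath='{.metadata.annotations.ingress\.kubernetes\.io/backends}'
  # Events that reference the Ingress, such as sync errors.
  kubectl -n "$ns" get events --field-selector "involvedObject.name=$name"
}

# Usage (placeholder values):
#   show_ingress_backends default my-ingress
```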

If these steps do not resolve the problem, consider filing it on the issue tracker or asking GCP support for help.