Closed rafaelarcanjo closed 1 week ago
This issue is currently awaiting triage.
If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
We have changed several defaults; some security-related settings that used to default to false now default to true.
Can you post the Helm chart values you used for the upgrade? There were also major changes to the Helm chart.
1.8 to 1.11 is a major jump; I would go back to the release notes and check for major differences that would break an RKE cluster. Also try upgrading one minor release at a time.
I would check that all required ports are open for the controller; the health check uses port 10254.
/remove-kind bug
/kind support
/triage needs-information
Same here. I have updated from 1.10.0. I'm deploying with these Helm values:
controller:
  kind: Deployment
  replicaCount: 3
  revisionHistoryLimit: 3
  resources:
    requests:
      cpu: 100m
      memory: 500Mi
    limits:
      cpu: 500m
      memory: 500Mi
  allowSnippetAnnotations: true
  config:
    proxy-body-size: "10m"
  service:
    externalTrafficPolicy: Local
  ingressClass: nginx
  minAvailable: 2
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app.kubernetes.io/component: controller
              app.kubernetes.io/name: ingress-nginx
          topologyKey: kubernetes.io/hostname
For this error:
Readiness probe failed: HTTP probe failed with statuscode: 500
please post proof that all requirements are met, for example that port 10254 is open.
For other errors, please post details that someone can analyze. The required details are listed in the new bug report template.
I see this, too: Liveness probe failed: Get "http://10.24.9.43:10254/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Maybe there is just not enough CPU to respond in time?
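One way to test the CPU hypothesis is to give the probes more time and the controller more CPU, then see whether the timeouts stop. The fragment below is only a sketch: it assumes the chart exposes `controller.livenessProbe` and `controller.readinessProbe` with a `timeoutSeconds` field, and the numbers are illustrative guesses, not recommendations. Check the `values.yaml` of the chart version you are installing before relying on it.

```yaml
# Hypothetical override for troubleshooting only; values are illustrative.
controller:
  resources:
    requests:
      cpu: 100m
      memory: 500Mi
    limits:
      cpu: "1"          # raised from 500m to rule out CPU throttling
      memory: 500Mi
  livenessProbe:
    timeoutSeconds: 5   # assumed to be short by default; longer for the test
  readinessProbe:
    timeoutSeconds: 5
```

If the probes still fail with the longer timeout, CPU starvation becomes a less likely explanation and the port 10254 connectivity check is the next thing to verify.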
The controller does not have any RKE-specific code, so the error message has to be used to hunt for the root cause. I suspect that the health check port (10254) is not open between the nodes. Also, as @mruoss suggested, starvation of CPU, memory, or network for a related process at the observed timestamp could be the root cause.
Please check the error message and track down the details from the logs and status accordingly.
Once you have an action item for the project, such as step-by-step instructions to reproduce the problem at will, please re-open the issue. I will close the issue for now, as it adds to the tally of open issues without tracking an action item for anyone.
/close
@longwuyuan: Closing this issue.
Hello,
I have ingress version 1.8.4 working perfectly. Because of CVE-2024-7646 I would like to update to the latest version, but it is not working.
My environment is an RKE cluster at version v1.26.11+rke2r1 running on bare metal. I used the bare-metal-specific deployment, but the pods do not come online.
I rolled back to version 1.8.4 and it's running without problems.
Could you help?
Thank you.