influxdata / helm-charts

Official Helm Chart Repository for InfluxData Applications
MIT License

influxdb doesn't come up after a pod relocation and tries to restart every 60 seconds #656

Closed batulziiy closed 4 months ago

batulziiy commented 4 months ago

Hi,

I have InfluxDB 1.8.10 running on Kubernetes (k3s). It had been running since last year until I relocated the pod for hardware maintenance on a Kubernetes node. After the relocation the influxdb pod didn't come up, and what's bizarre is that there is no error in the log. The pod log shows the database starting up and opening files from /var/lib/influx/data/xxx, then the pod just gets killed by the liveness probe. I tried increasing the timeout value, but it didn't help.

It was installed from the Helm chart and had been running normally until yesterday. Have you ever had the same experience with InfluxDB 1.8?

Thanks.

batulziiy commented 4 months ago

Update: I found out why the container was failing to start. The boot process took longer than the 60 s it was given, so the container failed before it finished starting. I was able to start the container by changing "terminationGracePeriodSeconds" from 60 to 900 in the StatefulSet: run kubectl edit sts influx1-influx1-influxdb -n influx, then set 'terminationGracePeriodSeconds' to 900. This might require a reboot, or the pod can simply be deleted.
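
For anyone who prefers not to edit the StatefulSet interactively, the same change can probably be made with kubectl patch. This is only a minimal sketch: the namespace and StatefulSet name are the ones from the comment above, while the helper names build_patch and apply_grace_period are made up for illustration. Nothing runs against the cluster until apply_grace_period is called.

```shell
#!/bin/sh
# Names taken from the comment above; adjust them for your own release.
NAMESPACE="influx"
STATEFULSET="influx1-influx1-influxdb"

# Hypothetical helper: build a JSON merge patch that sets
# terminationGracePeriodSeconds on the pod template.
build_patch() {
  printf '{"spec":{"template":{"spec":{"terminationGracePeriodSeconds":%d}}}}' "$1"
}

# Hypothetical helper: apply the patch with kubectl (needs cluster access).
# It is only defined here, not called.
apply_grace_period() {
  kubectl patch statefulset "$STATEFULSET" -n "$NAMESPACE" \
    --type merge -p "$(build_patch "$1")"
}
```

Calling apply_grace_period 900 should have the same effect as the manual edit. As noted above, the pod may still need to be deleted afterwards so that it is recreated with the new value.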