Try kubectl logs <pod-name> -n kubegres-system --previous
The --previous flag will show you the logs of the previous instantiation of the container.
Then try kubectl describe pod <pod-name> -n kubegres-system and check the "State" reason, the "Last State" reason and the "Events" section.
I observed an "OOMKilled" status just before the pod became "CrashLoopBackOff". I expanded the resources limit to 64Mi of memory (it used to be 30Mi) and the controller seems to work properly now. So why does the default config not work on my cluster? Should we just raise it to a larger number?
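In case it helps anyone else, a rough way to apply the same change is the patch below. This is only a sketch: it assumes the controller Deployment is named kubegres-controller-manager and the container is named manager, matching the pod and container names mentioned in this thread; adjust the names if yours differ.

kubectl patch deployment kubegres-controller-manager -n kubegres-system \
  -p '{"spec":{"template":{"spec":{"containers":[{"name":"manager","resources":{"limits":{"memory":"64Mi"}}}]}}}}'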
Thank you for highlighting the memory issue that you are having with the controller. I am going to use this issue to investigate why this is happening and eventually increase the memory limit as suggested. We have not had any issue in prod, so I am not sure why you are having this problem.
@alex-arica Thanks for your quick reply. I will keep the 64Mi limit for my cluster since it's working, and I look forward to any updates.
I am pleased to announce the release of Kubegres 1.10, which increases the Kubegres controller's memory limit from 30Mi to 60Mi. https://github.com/reactive-tech/kubegres/releases/tag/v1.10
Thanks to @edwardzjl for suggesting this feature.
To install Kubegres 1.10, please run:
kubectl apply -f https://raw.githubusercontent.com/reactive-tech/kubegres/v1.10/kubegres.yaml
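After applying the manifest, you can confirm the new limit was picked up with something like the command below (the deployment name assumes the default install; adjust if you customised it). It should report 60Mi for the manager container once v1.10 is in place.

kubectl get deployment kubegres-controller-manager -n kubegres-system \
  -o jsonpath='{.spec.template.spec.containers[*].resources.limits.memory}'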
Thanks for fixing this issue. We reproduced it on an on-premise cluster (10 nodes) but not on a cloud cluster (5 nodes). The cloud cluster has far less CPU/RAM in total than the on-premise cluster, so I guess the number of nodes plays a role in the crash? We could not get any useful info from the manager's logs either. Anyway, we will just upgrade now.
Thank you for your message. We have not identified the root cause of this issue. I could not reproduce it in our clusters.
Please let me know if after the upgrade you still encounter this issue.
I'm new to Kubegres, so this is probably a misconfiguration or something on my side.
After I set up a cluster following the getting-started tutorial, the kubegres-controller-manager pod keeps restarting, while the postgres instance pods seem to be running fine. I checked the logs of the two containers running inside the kubegres-controller-manager pod, the manager container and the kube-rbac-proxy container, and neither of them contains any errors. The postgres cluster and service are running well and I can access the database with no problem (I port-forward the postgres service to access the db, roughly as shown at the end of this post).
I'm not sure what's going on. Please help me get the kubegres-controller-manager pod working.
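For reference, the port-forward I mentioned looks roughly like this; the service name mypostgres is just an example, so replace it with the name of your own Kubegres resource/service (and add -n <namespace> if you deployed it outside the default namespace):

kubectl port-forward service/mypostgres 5432:5432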