Closed UlliBe closed 5 years ago
@torgon could you share a screenshot of the Cluster Profile and Elastic Profile you have configured?
Any environment details would also be helpful. Is the Helm chart deployed on minikube, EKS, GCP, etc.?
Sure. Elastic agent profile: https://gyazo.com/80ff8cef49f87f82e408c1f0e114b702
Deployed on a DigitalOcean Kubernetes cluster, v1.15.3.
@torgon I tried helm install of the latest GoCD helm chart on a minikube cluster with
minikube version: v1.4.0
commit: 7969c25a98a018b94ea87d949350f3271e9d64b6
Kubectl version
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.1", GitCommit:"d647ddbd755faf07169599a625faf302ffc34458", GitTreeState:"clean", BuildDate:"2019-10-02T23:49:20Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.3", GitCommit:"2d3c76f9091b6bec110a5e63777c332469e0cba2", GitTreeState:"clean", BuildDate:"2019-08-19T11:05:50Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
I could see the status report
2019-10-15 10:35:38,140 INFO [qtp1152224728-36] KubernetesPlugin:72 - [refresh-pod-state] Pod information successfully synced. All(Running/Pending) pod count is 0.
2019-10-15 10:35:38,145 INFO [qtp1152224728-36] KubernetesPlugin:72 - [status-report] Generating status report.
2019-10-15 10:35:38,153 INFO [qtp1152224728-36] KubernetesPlugin:72 - Running kubernetes nodes 1
2019-10-15 10:35:38,158 INFO [qtp1152224728-36] KubernetesPlugin:72 - Running pods 0
What happens when you do the following for your DigitalOcean kubernetes cluster?
kubectl get nodes
#pick a node where your GoCD server is deployed and run
kubectl describe nodes <name-of-the-node>
@torgon also, have you made any changes to the GoCD Helm chart values file? If so, can you let me know what the changes are? That might help in replicating this.
Name: pool-prod-01-wo4l
Roles:
NetworkUnavailable False Tue, 01 Oct 2019 08:43:14 +0000 Tue, 01 Oct 2019 08:43:14 +0000 CiliumIsUp Cilium is running on this node
MemoryPressure False Tue, 15 Oct 2019 11:09:34 +0000 Tue, 01 Oct 2019 08:42:59 +0000 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Tue, 15 Oct 2019 11:09:34 +0000 Tue, 01 Oct 2019 08:42:59 +0000 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Tue, 15 Oct 2019 11:09:34 +0000 Tue, 01 Oct 2019 08:42:59 +0000 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Tue, 15 Oct 2019 11:09:34 +0000 Tue, 01 Oct 2019 08:43:09 +0000 KubeletReady kubelet is posting ready status
Addresses:
  Hostname: pool-prod-01-wo4l
  InternalIP: 10.135.233.58
  ExternalIP: 206.81.21.21
Capacity:
  attachable-volumes-csi-dobs.csi.digitalocean.com: 7
  cpu: 4
  ephemeral-storage: 165105408Ki
  hugepages-1Gi: 0
  hugepages-2Mi: 0
  memory: 8170040Ki
  pods: 110
Allocatable:
  attachable-volumes-csi-dobs.csi.digitalocean.com: 7
  cpu: 4
  ephemeral-storage: 165105408Ki
  hugepages-1Gi: 0
  hugepages-2Mi: 0
  memory: 6694Mi
  pods: 110
System Info:
  Machine ID: 96dc6ede0ebd41bca73d4fcc38e016e3
  System UUID: 96dc6ede-0ebd-41bc-a73d-4fcc38e016e3
  Boot ID: cb256851-77b1-4780-b608-5b2d90200d5c
  Kernel Version: 4.19.0-0.bpo.5-amd64
  OS Image: Debian GNU/Linux 9 (stretch)
  Operating System: linux
  Architecture: amd64
  Container Runtime Version: docker://18.9.2
  Kubelet Version: v1.15.3
  Kube-Proxy Version: v1.15.3
PodCIDR: 10.244.2.0/24
ProviderID: digitalocean://161214164
Non-terminated Pods: (27 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
... skipped list of running pods
gocd gocd-server-5bdd76689f-zwpbt 0 (0%) 0 (0%) 0 (0%) 0 (0%) 18h
... skipped list of running pods
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
cpu 727m (18%) 1402m (35%)
memory 1287Mi (19%) 570Mi (8%)
ephemeral-storage 0 (0%) 0 (0%)
attachable-volumes-csi-dobs.csi.digitalocean.com 0 0
Events:
Helm set variable: server.service.type=ClusterIP
@torgon thank you for the information. As I understand it, this line here is responsible for converting the allocatable memory into a Long.
For one of your nodes, the allocatable memory is returned in different units than expected (6694Mi). This looks like a bug; we will take a look at it and publish a fix.
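To illustrate the kind of unit handling involved, here is a minimal sketch of turning a Kubernetes memory quantity such as "6694Mi" into a byte count. It is not the plugin's actual code or the eventual fix: the class name MemoryQuantity and the method toBytes are hypothetical, and the sketch only covers whole numbers with the common binary (Ki, Mi, Gi, Ti) and decimal (k, M, G, T) suffixes, not fractional or exponent forms.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of the GoCD Kubernetes plugin. */
final class MemoryQuantity {
    // Suffix -> multiplier in bytes. Binary suffixes are powers of 1024,
    // decimal suffixes are powers of 1000.
    private static final Map<String, Long> MULTIPLIERS = new LinkedHashMap<>();
    static {
        MULTIPLIERS.put("Ki", 1L << 10);
        MULTIPLIERS.put("Mi", 1L << 20);
        MULTIPLIERS.put("Gi", 1L << 30);
        MULTIPLIERS.put("Ti", 1L << 40);
        MULTIPLIERS.put("k", 1_000L);
        MULTIPLIERS.put("M", 1_000_000L);
        MULTIPLIERS.put("G", 1_000_000_000L);
        MULTIPLIERS.put("T", 1_000_000_000_000L);
    }

    private MemoryQuantity() {
    }

    /** Converts a quantity such as "6694Mi", "8170040Ki" or "1024" into bytes. */
    static long toBytes(String quantity) {
        String value = quantity.trim();
        for (Map.Entry<String, Long> entry : MULTIPLIERS.entrySet()) {
            String suffix = entry.getKey();
            if (value.endsWith(suffix)) {
                String digits = value.substring(0, value.length() - suffix.length());
                // multiplyExact throws ArithmeticException instead of silently overflowing.
                return Math.multiplyExact(Long.parseLong(digits), entry.getValue());
            }
        }
        // No recognised suffix: treat the value as a plain byte count.
        // An unsupported suffix still ends in NumberFormatException here.
        return Long.parseLong(value);
    }
}
```

With this sketch, MemoryQuantity.toBytes("6694Mi") yields 6694 * 1024 * 1024 bytes, whereas passing the same string straight to Long.valueOf throws the NumberFormatException seen in the report.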
That's great, thank you!
@torgon the team has made a fix for the issue you were facing. We have an experimental release available here, can you help us verify if this works in your environment?
To use this experimental release in your Helm chart, you can do the following:
helm install --name gocd-app --namespace gocd stable/gocd -f values.yaml
@adityasood that looks a whole lot better, seems to be fixed. Thank you!
Minor follow-up, though: the pod link in this screenshot is not working. https://gocdserver.../go/admin/status_reports/cd.go.contrib.elasticagent.kubernetes/gocd-agent-cac2d33d-abd0-4114-9609-ea237938d645 The pod is up and running.
@torgon can you please share the go-server and Kubernetes plugin logs for the same?
New GoCD install with Helm: GoCD 19.9.0, Plugin 3.2.0-187 / 3.3.0-191.
While trying to generate a status report on the elastic agent:
jvm 1 | 2019-10-14 16:22:37,097 INFO [qtp309906614-36] p.c.g.c.e.k.c.g.c.e.KubernetesPlugin:72 [plugin-cd.go.contrib.elasticagent.kubernetes] - [refresh-pod-state] Pod information successfully synced. All(Running/Pending) pod count is 0.
jvm 1 | 2019-10-14 16:22:37,100 INFO [qtp309906614-36] p.c.g.c.e.k.c.g.c.e.KubernetesPlugin:72 [plugin-cd.go.contrib.elasticagent.kubernetes] - [status-report] Generating status report.
jvm 1 | 2019-10-14 16:22:37,205 ERROR [qtp309906614-36] p.c.g.c.e.k.c.g.c.e.KubernetesPlugin:127 [plugin-cd.go.contrib.elasticagent.kubernetes] - Error while generating status report: For input string: "6694Mi"
jvm 1 | java.lang.NumberFormatException: For input string: "6694Mi"
jvm 1 | at java.base/java.lang.NumberFormatException.forInputString(Unknown Source)
jvm 1 | at java.base/java.lang.Long.parseLong(Unknown Source)
jvm 1 | at java.base/java.lang.Long.valueOf(Unknown Source)
jvm 1 | at cd.go.contrib.elasticagent.model.KubernetesNode.<init>(KubernetesNode.java:56)
jvm 1 | at cd.go.contrib.elasticagent.model.KubernetesCluster.lambda$new$0(KubernetesCluster.java:37)
jvm 1 | at java.base/java.util.stream.ReferencePipeline$3$1.accept(Unknown Source)
jvm 1 | at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Unknown Source)
jvm 1 | at java.base/java.util.stream.AbstractPipeline.copyInto(Unknown Source)
jvm 1 | at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown Source)
jvm 1 | at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(Unknown Source)
jvm 1 | at java.base/java.util.stream.AbstractPipeline.evaluate(Unknown Source)
Thanks for any idea how to fix this, been driving me up the wall for a few days now :(