pegasystems / pega-helm-charts

Orchestrate a Pega Platform™ deployment by using Docker, Kubernetes, and Helm to take advantage of Pega Platform Cloud Choice flexibility.
https://community.pega.com/knowledgebase/articles/cloud-choice
Apache License 2.0

Add livenessProbe: & readinessProbe: to Pega Helm Chart #407

Open denzer101 opened 2 years ago

denzer101 commented 2 years ago

Hi

We are trying to run a Pega install and a Pega upgrade with the Helm charts. During the install or a dry run we hit errors about livenessProbe: and readinessProbe:. There is also an issue with runAsUser, because Pega's default does not work at every company.

From troubleshooting and testing these charts, we found that the output references pega-zdt-upgrade in the pega-installer-job in the Helm chart. If possible, could Pega add these settings to the container section of _pega-installer-job.tpl? That would help with the issues we are currently having.

Pega-installerjob

Warning FailedCreate 24s job-controller Error creating: admission webhook "validation.gatekeeper.sh" denied the request:
[psp-pods-allowed-user-ranges] Container pega-in-place-upgrade is attempting to run without a required securityContext/runAsUser. Allowed runAsUser: {"ranges": [{"max": 65535, "min": 100}], "rule": "MustRunAs"}
[restricted-capabilities] container is not dropping all required capabilities. Container must drop all of ["KILL", "MKNOD", "SYS_CHROOT"]
[must-have-probes] Container in your has no
[must-have-probes] Container in your has no

Below is the output from running helm install --dry-run (note that the helm upgrade command does not work on an empty namespace). If the section above is updated, the new settings from our values.yaml should appear in the output after the CPU and memory entries.

Containers:
  pega-in-place-upgrade:
    Image:      pega-installer:8.6.3
    Port:       8080/TCP
    Host Port:  0/TCP
    Limits:
      cpu:     500m
      memory:  1000Mi
    Requests:
      cpu:     250m
      memory:  500Mi
    Environment Variables from:
      pega-upgrade-environment-config  ConfigMap  Optional: false
    Environment:
      ACTION:  upgrade
    Mounts:
      /opt/pega/config from pega-volume-installer (rw)
      /opt/pega/secrets from pega-volume-credentials (rw)
Volumes:
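For anyone who wants to check a dry-run render before submitting it to a cluster with similar policies, here is a minimal sketch. The function name `missing_policy_fields` is hypothetical, and the field list is only inferred from the Gatekeeper messages above. It does a plain text match on the rendered manifest, so it cannot tell which container a field belongs to.

```shell
# Hypothetical helper: read a rendered manifest on stdin and print each
# policy-relevant field name that does not appear anywhere in it.
# The field list is an assumption based on the denial messages in this issue.
missing_policy_fields() {
    manifest=$(cat)
    for field in runAsUser livenessProbe readinessProbe; do
        # Plain text match on "<field>:"; this is not a YAML-aware check.
        if ! printf '%s\n' "$manifest" | grep -q "$field:"; then
            echo "$field"
        fi
    done
}
```

It could be fed from the dry-run output, for example `helm install <release> <chart> --dry-run -f values.yaml | missing_policy_fields`, where `<release>` and `<chart>` are placeholders for your own values.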

petejo commented 2 years ago

The installer is a Kubernetes resource of kind 'Job' and not a 'Pod', so it doesn't make sense for it to have a readiness probe (it's not a service). For health (deadlock detection), we will look into providing a script that can be called via exec and that 'watches' the Job by tracking the log timestamp. If deadlock detection is an immediate concern for you, you may want to do the watching with a sidecar container, which gives you implementation flexibility that you can adjust to your specific deployment.
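The log-timestamp idea could look roughly like the sketch below. This is not the script Pega ships; the function name `log_is_fresh`, the log path, and the age threshold are all assumptions to adjust for your own deployment. It relies on `find -mmin`, which is available in GNU and BSD find but is not part of POSIX.

```shell
# Hypothetical staleness check: succeed only if the given log file exists
# and was modified less than max_age_minutes ago.
log_is_fresh() {
    log_file=$1
    max_age_minutes=$2
    # A missing log counts as unhealthy.
    [ -f "$log_file" ] || return 1
    # find prints the path only when the modification time is recent enough.
    [ -n "$(find "$log_file" -mmin "-$max_age_minutes")" ]
}
```

An exec probe or a sidecar loop could then call something like `log_is_fresh "$INSTALLER_LOG" 10`, where `INSTALLER_LOG` is a placeholder for wherever your installer writes its log. A non-zero exit status means the log has gone quiet.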

denzer101 commented 2 years ago

The installer upgrade fails without readiness and liveness probes. I might have figured out a way around this, but the job still fails because of it.


Thanks

RyanStan commented 2 years ago

This issue is likely due to a security policy configuration that was set on the GKE cluster. From this Google Cloud article about Gatekeeper:

Using Gatekeeper allows administrators to define policies with a constraint, which is a set of conditions that permit or deny deployment behaviors in Kubernetes. You can then enforce these policies on a cluster using a ConstraintTemplate

Try looking for ConstraintTemplate resources in your cluster. You may need to modify these so that the Pega charts can deploy. As petejo said above, we'll look into providing a liveness probe in the future for health, but a readiness probe does not make sense in this case.
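A minimal sketch of that lookup, assuming Gatekeeper is installed with its standard CRDs and that your account is allowed to read them. The function name `list_gatekeeper_policies` is hypothetical; the two kubectl calls are ordinary `kubectl get` commands.

```shell
# Hypothetical wrapper: list the Gatekeeper policy objects in the cluster.
list_gatekeeper_policies() {
    # Templates define the policy logic.
    kubectl get constrainttemplates
    # Constraints are the enforced instances; Gatekeeper registers them
    # under the "constraints" category.
    kubectl get constraints
}
```

The bracketed names in the denial message (psp-pods-allowed-user-ranges, restricted-capabilities, must-have-probes) should match constraint names in the second listing, which tells you which objects to inspect with `kubectl describe`.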

APegaDavis commented 1 year ago

@denzer101 were you able to work around this issue using Ryan's suggestion?

kishorv10 commented 2 months ago

@denzer101 Can you share the latest status? Were you able to work around the issue?

github-actions[bot] commented 1 week ago

This issue has been marked as stale because it has been open for 60 days with no activity. This issue will be automatically closed in 30 days if no further activity occurs.