defenseunicorns / uds-bundle-software-factory-nutanix

A UDS Bundle
Apache License 2.0

Configure gitaly with recommended resiliency settings #221

Closed jacobbmay closed 3 weeks ago

jacobbmay commented 1 month ago

GitLab recommends that, if you run Gitaly in Kubernetes, you enable cgroups in the chart and make sure the pod can access a node mountpoint for /sys/fs/cgroup.

The instructions for configuring that via chart values are here.

EDIT: Turned this into a general tuning-improvement ticket, as there were a few other settings I encountered while working on #213 that should be checked as well.

### Tasks
- [ ] cgroups control enabled and configurable at deploy time
- [x] Address pod disruption (https://docs.gitlab.com/ee/administration/gitaly/kubernetes.html)

jacobbmay commented 1 month ago

Specific recommendations about limits relative to pod resources are mentioned here. Following them helps prevent Gitaly from being OOM-killed.
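
As a rough illustration of "limits relative to pod resources", here is a minimal Python sketch that checks whether a cgroup memory budget stays below the pod memory limit. The function names and the 10% headroom are placeholder assumptions for illustration, not values taken from the GitLab recommendations; use the numbers from the linked docs.

```python
def parse_quantity(quantity: str) -> int:
    """Convert a Kubernetes-style binary quantity (e.g. '8Gi') to bytes.

    Only the Ki/Mi/Gi/Ti suffixes and plain integers are handled here.
    """
    units = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}
    for suffix, factor in units.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)


def cgroup_fits_pod(pod_memory_limit: str,
                    cgroup_memory_bytes: int,
                    headroom: float = 0.10) -> bool:
    """Return True if the cgroup budget leaves `headroom` below the pod limit.

    `headroom` is a hypothetical safety margin (assumption, not a GitLab value).
    """
    pod_bytes = parse_quantity(pod_memory_limit)
    return cgroup_memory_bytes <= pod_bytes * (1 - headroom)


if __name__ == "__main__":
    # A 6 GiB cgroup budget inside an 8Gi pod limit leaves room for Gitaly itself.
    print(cgroup_fits_pod("8Gi", 6 * 1024**3))
    # A cgroup budget equal to the pod limit leaves no room at all.
    print(cgroup_fits_pod("8Gi", 8 * 1024**3))
```

The idea is that git processes should hit their cgroup limit before the whole pod hits its own limit, so a check like this could run at deploy time against whatever values end up being configurable.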

JoeHCQ1 commented 1 month ago

Removed myself because I'm not actively working on this while I switch to the RKE2 installation instructions. I need to redeploy the EKSD clusters into RKE2.

jacobbmay commented 1 month ago

The ticket title was misleading, as we are not configuring HA Gitaly until GitLab gets Gitaly cluster in Kubernetes ready for general availability. Changed "HA" to "resiliency" in the title.

JoeHCQ1 commented 1 month ago

Pepr requires that I set up a policy exception to enable cgroups.

JoeHCQ1 commented 4 weeks ago

Links for future reference:

JoeHCQ1 commented 4 weeks ago

Internal Slack conversation on the security risks of the permissions the Gitaly init container requires: https://defense-unicorns.slack.com/archives/C06QJAUHWFN/p1730316451522489

The current error: Error: changing permissions for Gitaly pod 5d02861e-0cce-48d1-8789-a7f0354e6c6c cgroups: chown cgroup path "": chown /run/gitaly/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod5d02861e_0cce_48d1_8789_a7f0354e6c6c.slice: permission denied
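
For anyone debugging this later: the slice name in that path appears to be the pod UID with dashes swapped for underscores. A small Python sketch of that mapping, based only on the error above (the helper names are made up for illustration, and other cgroup drivers may lay paths out differently):

```python
import re
from typing import Optional


def pod_slice_name(pod_uid: str, qos: str = "burstable") -> str:
    """Build the slice name seen in the error from a pod UID and QoS class."""
    return f"kubepods-{qos}-pod{pod_uid.replace('-', '_')}.slice"


def pod_uid_from_path(cgroup_path: str) -> Optional[str]:
    """Recover the pod UID from a cgroup path, or None if no pod slice is found."""
    match = re.search(r"-pod([0-9a-f_]+)\.slice", cgroup_path)
    if match is None:
        return None
    return match.group(1).replace("_", "-")


if __name__ == "__main__":
    path = (
        "/run/gitaly/cgroup/kubepods.slice/kubepods-burstable.slice/"
        "kubepods-burstable-pod5d02861e_0cce_48d1_8789_a7f0354e6c6c.slice"
    )
    print(pod_uid_from_path(path))
```

This makes it easier to match a failing chown path back to the pod named in the error message.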

JoeHCQ1 commented 4 weeks ago

Adding the DisallowPrivileged exception didn't fix it either. I might also need exceptions for "Restrict hostPath Volume Mountable Paths" (note that there are two policies by that name, back to back).