vmware-tanzu / secrets-manager

VMware Secrets Manager is a lightweight secrets manager to protect your sensitive data. It’s perfect for edge deployments where energy and footprint requirements are strict—See more: https://vsecm.com/
https://vsecm.com/
BSD 2-Clause "Simplified" License
137 stars 24 forks source link

Any VSecM release other than `vsecm` crashes #454

Open v0lkan opened 5 months ago

v0lkan commented 5 months ago

Executing helm install vsecm vsecm/vsecm works fine and passes all the integration and unit tests.

However, helm install mahmut vsecm/vsecm fails with the following state:

aegis@aegis:~/WORKSPACE/VSecM (ovolkan/changes)$ k get po -n vsecm-system
NAME                               READY   STATUS             RESTARTS          AGE
mahmut-safe-7b6b6d6f65-gbg9r       0/1     CrashLoopBackOff   546 (3m14s ago)   46h
mahmut-sentinel-64498cfcf5-jc278   1/1     Running            0                 46h
aegis@aegis:~/WORKSPACE/VSecM (ovolkan/changes)$ k get po -n spire-system
NAME                            READY   STATUS    RESTARTS      AGE
spire-agent-c5wwm               3/3     Running   2 (46h ago)   46h
spire-agent-wjjsz               3/3     Running   2 (46h ago)   46h
spire-server-67b8bb8d7f-k4g7v   2/2     Running   0             46h

VsecM Safe is crashing because it fails to communicate with the SPIFFE workload API.

One reason is, our predicates search for vsecm-sentinel by default unless VSECM_SENTINEL_SPIFFEID_PREFIX is set.

func SentinelSpiffeIdPrefix() string {
    p := os.Getenv("VSECM_SENTINEL_SPIFFEID_PREFIX")
    if p == "" {
        p = "spiffe://vsecm.com/workload/vsecm-sentinel/ns/vsecm-system/sa/vsecm-sentinel/n/"
    }
    return p
}

As a user, this is an implementation detail to me. — I should not need to set an env var, just to change a release name.

So, wherever that variable is interpolated, the release name should be dynamically injected; something similar to this:

   value: "spiffe://vsecm.com/workload/vsecm-safe/ns/vsecm-system/sa/{{ .Release.Name }}-safe/n/"

instead of

   value: "spiffe://vsecm.com/workload/vsecm-safe/ns/vsecm-system/sa/vsecm-safe/n/"
v0lkan commented 5 months ago

@abhishek44sharma: mentioning you to keep you in the loop; there’s already someone in the community to work on this, and you might want to help them if there’s any.

I asked him to comment on this issue so that we can assign him as the owner of the issue.

sahinakyol commented 5 months ago

Hello, could you assign to me @v0lkan ?

v0lkan commented 5 months ago

The best way to work on this, I think, is to first break it.

As in, run the helm install mahmut vsecm/vsecm and then

kubectl get clusterspiffeid and kubectl describe clusterspiffeid

kubectl get serviceaccount -n vsecm-system etc.

v0lkan commented 5 months ago

more details

Name:         mahmut-safe
Namespace:
Labels:       app.kubernetes.io/managed-by=Helm
Annotations:  meta.helm.sh/release-name: mahmut
              meta.helm.sh/release-namespace: default
API Version:  spire.spiffe.io/v1alpha1
Kind:         ClusterSPIFFEID
Metadata:
  Creation Timestamp:  2024-01-29T23:28:53Z
  Generation:          1
  Resource Version:    214641
  UID:                 e444ad6b-1ac0-43f9-9f6d-e9359b5bef88
Spec:
  Pod Selector:
    Match Labels:
      app.kubernetes.io/name:     mahmut-safe
      app.kubernetes.io/part-of:  vsecm-system
  Spiffe ID Template:             spiffe://vsecm.com/workload/mahmut-safe/ns/{{ .PodMeta.Namespace }}/sa/{{ .PodSpec.ServiceAccountName }}/n/{{ .PodMeta.Name }}
  Workload Selector Templates:
    k8s:ns:vsecm-system
    k8s:sa:vsecm-safe
Status:
  Stats:
    Entries Masked:             0
    Entries To Set:             1
    Entry Failures:             0
    Namespaces Ignored:         4
    Namespaces Selected:        6
    Pod Entry Render Failures:  0
    Pods Selected:              1
Events:                         <none>

However releaseName is not interpolated in VSECM_SENTINEL_SPIFFEID_PREFIX env var that vsecm-safe has.

The resolve spiffeid should be

spiffe://vsecm.com/workload/mahmut-safe/ns/vsecm-system/sa/mahmut-safe/n/

and those mahmuts shall come from {{ .releaseName }} dynamically.

Also

aegis@aegis:~/WORKSPACE/VSecM (main)$ k get sa -n vsecm-system
NAME             SECRETS   AGE
default          0         4m40s
vsecm-safe       0         4m40s
vsecm-sentinel   0         4m40s

these should have been mahmut-safe and mahmut-sentinel

And since releases are typically designed to be isolated, maybe even the namespace shall be mahmut-system.

v0lkan commented 4 months ago

Assigned this to myself too; I may look into it sometime, but I have higher prio items first.