kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0
109.91k stars 39.35k forks source link

Use `secrets` field in `NodePublishVolumeRequest` for sa tokens #118377

Open aramase opened 1 year ago

aramase commented 1 year ago

If the TokenRequests field is set in the CSIDriver spec, the tokens are generated and funneled from kubelet to driver as part of volume attributes map. The key for the tokens in the map is csi.storage.k8s.io/serviceAccount.tokens. The service account tokens can be used for workload identity in cloud providers. Especially in case of Secrets Store CSI Driver, these service account tokens are used to access external secrets store to get secrets and mount in the volume.

As these tokens are sensitive information, it doesn't feel right to include these in the volume context map along with pod name, namespace, service account name which are non-sensitive information. There is a dedicated secrets field in the proto which today is being used for nodePublishSecretRef that would be more appropriate for tokens.

To provide more context on "why this issue", we had a CVE in the Secrets Store CSI Driver (ref: CVE-2023-2878) where the service account tokens were logged as part of the GRPC request in the driver. We use the protosanitizer in the driver to sanitize the requests before logging but given volume context is not really considered a "secret", the tokens were skipped and logged. We got around this by building custom logic for logging in the driver to strip the tokens but this is something every driver would need to do to be careful and not log the tokens. A change could be made in protosanitizer to sanitize this but given volume context is a map, either entire map needs to be marked a secret or sanitizing has to be done on a field by field basis. Although this prompted me to create this issue, I think volume context is not really "secrets" and appropriate field for sensitive sa tokens would be the secrets field.

/sig storage

aramase commented 1 year ago

/kind bug

xing-yang commented 1 year ago

/triage accepted

aramase commented 1 year ago

/assign

liggitt commented 1 year ago

/remove-sig auth

msau42 commented 1 year ago

Copying discussion from slack:

Perhaps we can use a k8s-prefixed key in the csi.NodePublishVolumeRequest.secrets field (that is appended with the secrets from nodePublishVolumeSecretRef.

The next challenge would be how to roll out this change because the csi tokens feature is already GA

aramase commented 1 year ago

Perhaps we can use a k8s-prefixed key in the csi.NodePublishVolumeRequest.secrets field (that is appended with the secrets from nodePublishVolumeSecretRef.

I was thinking the same while proposing reusing secrets field. This will work.

The next challenge would be how to roll out this change because the csi tokens feature is already GA

Should we add a feature gate to start off that allows the current behavior of setting tokens in VolumeContext and set that to false by default in N+2 releases?

v1.29: ServiceAccountTokenInVolumeContext=true allows setting the token in VolumeContext and secrets field. v1.30: ServiceAccountTokenInVolumeContext=false disabled by default but can be enabled to rely on legacy behavior. v1.31: ServiceAccountTokenInVolumeContext=false and feature gate deprecated, so tokens will only be populated in secrets

liggitt commented 1 year ago

do drivers need to opt-into putting the token data in the secret field, or does that get transparently merged to the same location so any driver resolving values would get the token? if opt-in is required, a toggle could be added to TokenRequest to let them do that

aramase commented 1 year ago

do drivers need to opt-into putting the token data in the secret field, or does that get transparently merged to the same location so any driver resolving values would get the token?

drivers read the sa tokens from the VolumeContext map today as part of their implementation. With the feature gate, I'm thinking we can gradually roll out the change, giving the implementors time to switch from reading tokens from VolumeContext to secrets. I think in some release we would want to stop populating the tokens in VolumeContext.

liggitt commented 1 year ago

I don't think time is a sufficient gate to changing the behavior there... I think you need an opt in to relocate the information

msau42 commented 1 year ago

Today it is opt-in by setting CSIDriver.TokenRequests.

Perhaps a new field to TokenRequest can be added to control which CSI field it will be passed down through.

aramase commented 1 year ago

TokenRequests is a list of TokenRequest. If we define a field at the TokenRequest level, it'll be too granular. Should we define something at the CSIDriver spec level instead?

msau42 commented 1 year ago

Yeah that could work

bswartz commented 7 months ago

I agree that dropping the token from the VolumeContext needs to be an opt-in behavior. If the opt-in is done through the CSIDriver API, it will be permanently ugly for backwards compatibility reasons. The alternative is a proper CSI feature (new capability) that kubelet could query and plugins could opt-in that way. The upside to fixing this is relatively low, because plugins can work around the problem individually, and any plugin that is likely to opt-in to better behavior could also just implement the workaround.

aramase commented 4 months ago

Another CSI plugin impacted by this issue: https://github.com/kubernetes/kubernetes/issues/124759.