marpaia opened this issue 6 years ago
@marpaia thanks for this request. Could you please explain in a bit more detail what the flow looks like between Ark and the separate project that contains the key? Would you specify the project and the key in the backupStorageProvider config, and would Ark retrieve the key from the specified project?
Encrypting backups at rest is for sure not something that is unique to GCP, but in GCP you need three bits of information (in addition to the project) to encrypt/decrypt a blob: the key's location, the key ring name, and the key name.
I think Ark is already 1:1 tied with a GCP Project, but it's worth noting that GCP has a Separation of Duties document which outlines the best practice of storing your KMS key-ring in an isolated project.
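To make the project/key relationship concrete: Cloud KMS addresses a key by project, location, key ring, and key name, combined into a single resource path. A minimal sketch (all example values below are hypothetical):

```go
package main

import "fmt"

// kmsKeyName builds the fully qualified Cloud KMS CryptoKey resource name.
// Given the project, the remaining three bits are the location, the key
// ring, and the key name.
func kmsKeyName(project, location, keyRing, key string) string {
	return fmt.Sprintf("projects/%s/locations/%s/keyRings/%s/cryptoKeys/%s",
		project, location, keyRing, key)
}

func main() {
	// Per the Separation of Duties guidance, the key ring can live in a
	// separate, locked-down project from the one Ark runs in.
	fmt.Println(kmsKeyName("my-kms-project", "global", "ark-keys", "backup-key"))
}
```

Granting the Ark service account decrypt permission on just this key in the isolated project is what keeps the duties separated.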
The code that I have now is an implementation of the following interface for storing a *corev1.Secret in GCS, encrypted at rest via KMS:
```go
package secret

import (
	"context"

	corev1 "k8s.io/api/core/v1"
)

// Store is the interface which defines the controller's interactions with an
// arbitrary exo-cluster secret storage mechanism.
type Store interface {
	Get(ctx context.Context, namespace, name string) (*corev1.Secret, error)
	List(ctx context.Context, namespace string) ([]*corev1.Secret, error)
	Put(ctx context.Context, s *corev1.Secret) error
	Delete(ctx context.Context, namespace, name string) error
}
```
I don't think my implementation would be super useful to you, because you can currently only use the KMS API to encrypt/decrypt chunks of data up to 64 KiB. From the docs:
Cloud KMS can handle secrets up to 64 KiB in size. If you need to encrypt larger secrets, it is recommended that you use a key hierarchy, with a locally-generated data encryption key (DEK) to encrypt the secret, and a key encryption key (KEK) in Cloud KMS to encrypt the DEK. To learn more about DEKs, see Envelope Encryption.
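The DEK/KEK pattern the docs describe can be sketched in a few lines. This is a minimal, assumption-laden illustration: the payload is sealed locally with a freshly generated AES-256 data encryption key (DEK), and only the 32-byte DEK would be sent to Cloud KMS for wrapping (the KMS call itself is left as a comment, since it needs a live client):

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// sealPayload encrypts an arbitrarily large payload with a locally generated
// 256-bit DEK using AES-GCM. Only the tiny DEK ever needs to go to Cloud KMS,
// which sidesteps the 64 KiB plaintext limit.
func sealPayload(plaintext []byte) (ciphertext, dek []byte, err error) {
	dek = make([]byte, 32) // AES-256 data encryption key
	if _, err = rand.Read(dek); err != nil {
		return nil, nil, err
	}
	block, err := aes.NewCipher(dek)
	if err != nil {
		return nil, nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err = rand.Read(nonce); err != nil {
		return nil, nil, err
	}
	// Prepend the nonce so the ciphertext is self-describing.
	return gcm.Seal(nonce, nonce, plaintext, nil), dek, nil
}

// openPayload reverses sealPayload given the (unwrapped) DEK.
func openPayload(ciphertext, dek []byte) ([]byte, error) {
	block, err := aes.NewCipher(dek)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce, body := ciphertext[:gcm.NonceSize()], ciphertext[gcm.NonceSize():]
	return gcm.Open(nil, nonce, body, nil)
}

func main() {
	ct, dek, _ := sealPayload([]byte("backup tarball bytes"))
	// In real code: wrap dek via the Cloud KMS Encrypt API (the KEK), then
	// store ct plus the wrapped DEK in GCS; on restore, KMS Decrypt the DEK.
	pt, _ := openPayload(ct, dek)
	fmt.Println(string(pt))
}
```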
This envelope encryption song and dance is kind of annoying. KMS has one job IMO: decrypt/encrypt my damn stuff. Anyway, we avoid it entirely by implementing an interface like the one above. This lets us deal with individual secrets, which are all smaller than 64KiB in our environment.
Let me know if some of our KMS snippets would be helpful and I can share them privately @ncdc.
It's also worth noting that the API I linked above lets you encrypt an object with a single 32-byte AES-256 key. Rather than using envelope encryption on the tar, it would probably be easiest to use KMS to encrypt just that key (as a DEK) and store it in GCS as well. Decrypt only the key via KMS, then use this API to encrypt the entire tar with the one key. The key rotation and access control story is not as good with this solution, but it's a much simpler solution in general.
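For reference, the single-key approach maps onto GCS customer-supplied encryption keys (CSEK), where requests carry the raw AES-256 key and its SHA-256 hash as headers. A small sketch of generating such a key and the headers GCS expects (header names per the GCS CSEK docs; in the scheme above, the raw key would additionally be KMS-encrypted and stored alongside the backup):

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// csekHeaders generates a fresh 32-byte AES-256 key plus the request headers
// GCS expects for customer-supplied encryption keys.
func csekHeaders() (key []byte, headers map[string]string, err error) {
	key = make([]byte, 32)
	if _, err = rand.Read(key); err != nil {
		return nil, nil, err
	}
	sum := sha256.Sum256(key)
	headers = map[string]string{
		"x-goog-encryption-algorithm":  "AES256",
		"x-goog-encryption-key":        base64.StdEncoding.EncodeToString(key),
		"x-goog-encryption-key-sha256": base64.StdEncoding.EncodeToString(sum[:]),
	}
	return key, headers, nil
}

func main() {
	_, h, _ := csekHeaders()
	fmt.Println(h["x-goog-encryption-algorithm"]) // → AES256
}
```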
It might be good to get @mattmoyer's thoughts on this, as I don't fully understand the chain of trust using KMS and the impact on Ark & security. How does something like Vault or sealed-secrets play into this?
We should also look at the other major cloud providers (and any bare-metal equivalents) to make sure this feature will work across a variety of platforms.
Agreed, @mattmoyer let us know if you have some time to discuss
How's this? Allow using kubeseal for secrets: just add optional params, '--kubeseal', which then requires '--secret', '--controller', etc. The tool encrypts the backup using a generated key stored as a secret; the secret is sealed and exists as 'secret-name'. For restore, it would only need that same secret for decryption.
Might need finessing, but I currently have to install components on the pod and do an etcdctl snapshot, so it'd be awesome to have it running as a k8s batch job.
I'll help if I can, not a go programmer YET.
@erasmus74 thanks for your idea! @mattmoyer WDYT?
There was also a suggestion from #ark-dr to use https://github.com/mozilla/sops/.
@erasmus74 are you describing using kubeseal to seal the entire backup tarball, or just the secrets contained in the tarball?
Do we want to encrypt everything or just the secrets? If just secrets, then wouldn't enabling encryption at rest suffice?
@kzap Velero uses the Kubernetes API server to backup resources, instead of backing up from etcd directly. This means that even if encryption at rest is enabled, Velero will backup the plaintext Secret because it is decrypted by the Kubernetes API server.
From the Velero perspective, I imagine it would be easiest to encrypt the full backup.
In the meantime, you could use something like sealed-secrets and only backup the encrypted SealedSecret resources instead of Secrets.
Thank you for clarifying. Is there a way to encrypt the full backup before storing it in object storage? Can restic take care of this part for us?
Unfortunately not really; the backups with restic are encrypted with a static key (see https://github.com/vmware-tanzu/velero/issues/1053). Even if that work is done, though, restic is only used to store volume snapshots, so the resources that were backed up also need to be encrypted and stored.
AFAIU, it is already possible to configure (some of?) the providers for encryption at rest using the config field in BackupStorageLocation:
GCP: https://github.com/vmware-tanzu/velero-plugin-for-gcp/blob/master/backupstoragelocation.md
AWS: https://github.com/vmware-tanzu/velero-plugin-for-aws/blob/master/backupstoragelocation.md
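As a concrete illustration of what the linked plugin docs describe, a BackupStorageLocation for the AWS plugin can request KMS-backed server-side encryption via its config map (the bucket name and key alias below are hypothetical; see the plugin's backupstoragelocation.md for the authoritative field list):

```yaml
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: default
  namespace: velero
spec:
  provider: aws
  objectStorage:
    bucket: my-velero-bucket        # hypothetical bucket name
  config:
    region: us-east-1
    serverSideEncryption: aws:kms   # ask S3 to encrypt with a KMS key
    kmsKeyId: alias/velero          # hypothetical KMS key alias
```

The GCP plugin exposes an analogous kmsKeyName option for Cloud KMS-backed bucket encryption.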
Also a related PR: https://github.com/vmware-tanzu/velero/pull/1879
I think this is not related to snapshotting volumes but already makes it possible to encrypt kubernetes resources (including secrets) at rest.
What is the difference with the issue here? Can't we say "velero already supports encrypting backups at rest?"
@turkenh you're correct, Velero already supports server-side encryption at rest in the AWS, Azure and GCP plugins.
As part of this issue, we also discussed client-side encryption, i.e. Velero encrypts the backup data before sending it up to object storage, rather than letting the object storage system itself encrypt the data.
We've decided not to go down that path for now, so I do think we could probably close this issue out, and reopen/open new ones if we decide to look at client-side encryption down the road.
We're planning to do a big backlog review in the near future, so when we look at this issue as a team we can decide if we're ready to close it.
About client-side encryption: why not implement the aws-sdk-go client-side encryption using "Option 2: Using a master key stored within your application", explained here: https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingClientSideEncryption.html#client-side-encryption-client-side-master-key-intro ?
Edit: updated link
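To spell out what "Option 2" amounts to: the application holds a long-lived master key, generates a per-object data key, encrypts the object with the data key, wraps the data key with the master key, and attaches the wrapped key to the object as metadata. The sketch below is a hand-rolled illustration of that envelope layout, not the aws-sdk-go implementation; the metadata key name is illustrative, not the SDK's exact one:

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// encryptForUpload encrypts object bytes with a fresh per-object data key,
// wraps that data key with the application-held master key, and returns the
// ciphertext plus metadata to store alongside it in S3.
func encryptForUpload(masterKey, object []byte) (ciphertext []byte, metadata map[string]string, err error) {
	dataKey := make([]byte, 32)
	if _, err = rand.Read(dataKey); err != nil {
		return nil, nil, err
	}
	// seal does AES-256-GCM with a random nonce prepended to the output.
	seal := func(key, plaintext []byte) ([]byte, error) {
		block, err := aes.NewCipher(key)
		if err != nil {
			return nil, err
		}
		gcm, err := cipher.NewGCM(block)
		if err != nil {
			return nil, err
		}
		nonce := make([]byte, gcm.NonceSize())
		if _, err := rand.Read(nonce); err != nil {
			return nil, err
		}
		return gcm.Seal(nonce, nonce, plaintext, nil), nil
	}
	if ciphertext, err = seal(dataKey, object); err != nil {
		return nil, nil, err
	}
	wrapped, err := seal(masterKey, dataKey)
	if err != nil {
		return nil, nil, err
	}
	metadata = map[string]string{
		"x-amz-meta-wrapped-key": base64.StdEncoding.EncodeToString(wrapped),
	}
	return ciphertext, metadata, nil
}

func main() {
	master := make([]byte, 32) // in practice, loaded from secure app config
	ct, md, err := encryptForUpload(master, []byte("backup.tar.gz bytes"))
	if err != nil {
		panic(err)
	}
	// In real code: s3 PutObject with ct as the body and md as object metadata.
	fmt.Println(len(ct) > 0, md["x-amz-meta-wrapped-key"] != "") // → true true
}
```

The catch, as noted later in this thread, is that Velero would then own key management: losing the master key means losing every backup encrypted under it.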
I found this issue while searching to find out whether Vault could be used as a KMS. https://github.com/libopenstorage/secrets could be used as an abstraction layer; the KMSes it supports are listed in the top-level directory of that project. It's an abstraction over these projects, each of which provides different features and benefits. Most importantly, Velero would be more adaptable for enterprise deployments that already depend on one of these KMSes.
@fgimenezm and @briantopping thanks for these inputs, definitely helpful when we get around to designing for this. Keep suggestions and ideas coming, thank you.
Additional explanation from a Velero dev that helped me understand this better:
There are two places we might want to think about encryption at rest: the block storage where the volume snapshotters store snapshots, and the object storage where restic backups are sent and where the metadata tarball is stored.

Block storage largely already has encryption. For example, if you are using EBS, you can enable encryption there. Velero doesn't need to know about the encryption or how to decrypt, since EBS handles all of that under the covers; Velero just calls the EBS APIs with a snapshot ID to read the data during a restore.

Object storage, however, is much trickier. While encryption may already be available through, say, S3, Velero would have to actually decrypt the data before it can do a restore. This means Velero would have to handle user keys, and deal with what happens if a user loses their key. Because this could result in users being locked out of their backups, along with other security issues, we want to tread carefully here.

(Note from PM: encryption at rest is on the Velero roadmap, but we first want to investigate possibilities so we land on the safest solution for Velero users.)
Hello,
do I understand correctly that it's still not supported to encrypt Kubernetes resources and their metadata or is this possible by using Kopia now?
Kopia does not cover non-volume resources. Velero still does not support encryption at rest for k8s manifests.
Thanks for that fast response and clarification. Are there any plans to support it soon?
Encrypted Kubernetes resources would be important for us to backup sensitive data, but also to ensure that the integrity of a backup is not altered by malicious intent.
Not soon. There is other ongoing work that is higher priority at the moment, IMO.
Feel free to submit a design and implementation to help move this requirement along and we'll review them accordingly.
Right now, when backups are created via ark backup create, sensitive objects are stored unencrypted at rest. In Google Cloud, there is excellent Go support for encrypting Google Cloud Storage objects with Google Cloud KMS in a way that works rather transparently to the caller.

I would love a way to configure the KMS key to use when storing backups. The best practice is to have a separate project for the key and grant IAM permissions from there, which I could easily do for the heptio-ark SA that is already required when setting up access to the bucket.

At @kolide, we have an internal tool that works like that, which I'd like to potentially replace with Ark, so if there is a reasonable integration point within Ark for this kind of thing, perhaps some of our existing code can be upstreamed for this use case.