gardener / etcd-backup-restore

Collection of components to backup and restore the etcd of a Kubernetes cluster.
Apache License 2.0

Strong Backup Encryption and Key Management #83

Open ThormaehlenFred opened 5 years ago

ThormaehlenFred commented 5 years ago

etcd Backup Strong Encryption

Please encrypt sensitive data when stored persistently. When sensitive data is stored persistently, for example in files, it needs to be protected from unauthorized access. (If access control cannot be fully enforced where the storage takes place and) as a second line of defense against access control failures and loss of data confidentiality, the data shall be strongly encrypted. The following encryption requirement shall be met: enable strong encryption at the storage level.

Encryption Key Management

Encryption key management shall ensure that the principles of least privilege and segregation of duties can be followed by the operations team.

georgekuruvillak commented 5 years ago

The data stored in the EBS volume as well as the backup needs to be encrypted. In the case of data stored in EBS volumes, should encrypted volumes be enough? Etcd snapshots can be encrypted before sending them to the cloud store. We need to have an extra key generated per shoot for encryption (I need to assess whether the TLS keys we generate for the etcd server can in some way be used to generate the passphrase for encryption). We can use the crypto libraries in Go to use an AES cipher. I need to understand the memory/CPU overhead we should account for in the encryption/decryption operation.
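
To make the AES idea concrete, here is a minimal, hedged sketch (not code from this project) that seals a snapshot blob with AES-256-GCM from the Go standard library; the key handling is deliberately simplified and would be replaced by whatever per-shoot key scheme is chosen:

```go
// Illustrative only: seal a snapshot blob with AES-256-GCM.
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
	"io"
)

// encryptSnapshot seals plaintext with a 32-byte key and prepends the nonce
// so it can be recovered before decryption. GCM also authenticates the data,
// so tampering with the stored snapshot is detected when it is opened again.
func encryptSnapshot(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key) // 32-byte key selects AES-256
	if err != nil {
		return nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return nil, err
	}
	return aead.Seal(nonce, nonce, plaintext, nil), nil
}

func main() {
	key := make([]byte, 32) // placeholder; a real key would come from key management
	if _, err := io.ReadFull(rand.Reader, key); err != nil {
		panic(err)
	}
	sealed, err := encryptSnapshot(key, []byte("snapshot bytes"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("sealed %d bytes\n", len(sealed))
}
```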

marwinski commented 5 years ago

We will enable etcd secret encryption (https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/) so I believe that having an encrypted etcd EBS volume is not necessary. We need to ensure however that the snapshots and full backups are encrypted.

vlerenc commented 5 years ago

Do we know whether we can do the encryption on-the-fly or do we need larger ETCD disks, because we need to keep a local copy of the data or encrypted data before we can upload it in parallel chunks?

swapnilgm commented 5 years ago

Currently we use multi-chunk upload logic with store-and-forward. So, the currently expected disk size is twice the supported DB size.

With encryption, I assume we will go for a block cipher, most likely AES-GCM. The term on-the-fly here has two use cases. If we interpret on-the-fly as streaming encryption, then AFAIK encryption with a block cipher can't be done entirely in a streaming way, so the whole data needs to be loaded in memory. We can do it chunk by chunk to optimise memory there. Once a chunk is encrypted, we can simply upload it. But we don't want to load an entire chunk into memory and encrypt it again and again in case we fail to upload the chunk, maybe because of a network issue. So, we might have to keep a local copy of the encrypted chunk. This now actually amounts to thrice the DB size.
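
To illustrate the store-and-forward flow described above (encrypt a chunk, persist the encrypted chunk, then upload it so a failed upload can be retried without re-encrypting), here is a hedged Go sketch; the chunk size, file naming and the uploadChunk helper are made-up placeholders rather than the project's actual upload code:

```go
// Sketch of chunk-wise encrypt-then-stage-then-upload, assuming AES-GCM.
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
	"io"
	"os"
)

const chunkSize = 64 << 20 // 64 MiB per chunk, an assumed value

// uploadChunk is a hypothetical stand-in for the object-store upload call.
func uploadChunk(path string) error { return nil }

func encryptAndStageChunks(snapshotPath string, key []byte) error {
	in, err := os.Open(snapshotPath)
	if err != nil {
		return err
	}
	defer in.Close()

	block, err := aes.NewCipher(key)
	if err != nil {
		return err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return err
	}

	buf := make([]byte, chunkSize)
	for i := 0; ; i++ {
		n, err := io.ReadFull(in, buf)
		if err == io.EOF {
			break // no more snapshot data
		}
		if err != nil && err != io.ErrUnexpectedEOF {
			return err
		}
		nonce := make([]byte, aead.NonceSize())
		if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
			return err
		}
		sealed := aead.Seal(nonce, nonce, buf[:n], nil)

		// Keep the encrypted chunk on disk so a failed upload can be
		// retried without re-reading and re-encrypting the snapshot.
		chunkPath := fmt.Sprintf("%s.enc.%04d", snapshotPath, i)
		if err := os.WriteFile(chunkPath, sealed, 0o600); err != nil {
			return err
		}
		if err := uploadChunk(chunkPath); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	key := make([]byte, 32) // placeholder key for the sketch
	if err := encryptAndStageChunks("snapshot.db", key); err != nil {
		fmt.Println("encrypt/stage failed:", err)
	}
}
```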

Now, regarding on-the-fly decryption, again streaming decryption is not possible. So, we have to download and store the encrypted snapshot locally, and then decrypt it and restore. So, twice the supported DB size. As part of restoration, we restore the DB file in a temporary directory on disk, and once that is done we delete the corrupted actual data directory on disk. We could delete it earlier, but in the worst case, if something goes wrong and restoration doesn't succeed, it is good for debugging. And anyway we had twice the DB size in disk space because of the store-and-forward snapshotting logic, so we were fine with keeping the corrupted data directory until restoration succeeds. If we continue to keep it this way, then at the time of restoration the disk will have the corrupted DB data directory + the encrypted snapshot + the restored data directory. So, again thrice the DB size. Otherwise, the corrupted data directory can be deleted when restoration starts, optimising it to twice the DB size here.

@amshuman-kr @shreyas-s-rao: WDYT?

amshuman-kr commented 5 years ago

@swapnilgm An alternative to avoid 3 times the DB size could be if we encrypt on the fly while writing the snapshot to the disk in the first place. Then we would need only twice the size, like now. WDYT?

vlerenc commented 5 years ago

I don't know whether I can follow. I assume, we can and will hold a chunk in memory (encrypt + upload and download + decrypt). We anyways limit ourselves to 5 or so chunks to not become a noisy neighbour to ETCD itself and we can keep these few chunks in memory, right? In that case, even if upload or download fails, we retry until it succeeds. So why would we need three times the space?

swapnilgm commented 5 years ago

@amshuman-kr

An alternative to avoid 3 times the DB size could be if we encrypt on the fly while writing the snapshot to the disk in the first place. Then we would need only twice the size, like now. WDYT?

No. I think it's not that straightforward and will hurt us from a performance point of view. Two points here:

  1. Encryption will add to the time, and the etcd snapshot request will take longer.
  2. As I said earlier, for a block cipher one can't do streaming encryption. So, the kind of complex logic I can think of there is (see the rough sketch after this list):
    • open a snapshot reader on etcd using the API call
    • read a block of snapshot data from etcd into memory
    • encrypt it
    • write it to the file
    • repeat till the entire snapshot is read
    • close the snapshot reader

I am afraid that we will end up seeing too many "snapshot took too long" warnings on etcd.
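
A rough, hypothetical sketch of that loop, assuming the etcd clientv3 maintenance Snapshot API (the import path differs across etcd versions) and an arbitrary block size; it only illustrates the complexity under discussion and is not a proposed implementation:

```go
// Sketch: stream the etcd snapshot, encrypting block by block while writing to disk.
package main

import (
	"context"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/binary"
	"io"
	"os"
	"time"

	clientv3 "go.etcd.io/etcd/clientv3" // path may be go.etcd.io/etcd/client/v3 on newer etcd
)

const blockSize = 4 << 20 // 4 MiB read blocks, an assumed value

func snapshotEncrypted(cli *clientv3.Client, key []byte, outPath string) error {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
	defer cancel()

	rc, err := cli.Snapshot(ctx) // open the snapshot stream via the maintenance API
	if err != nil {
		return err
	}
	defer rc.Close()

	out, err := os.Create(outPath)
	if err != nil {
		return err
	}
	defer out.Close()

	block, err := aes.NewCipher(key)
	if err != nil {
		return err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return err
	}

	buf := make([]byte, blockSize)
	for {
		n, rerr := io.ReadFull(rc, buf) // read the next block of snapshot data into memory
		if n > 0 {
			nonce := make([]byte, aead.NonceSize())
			if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
				return err
			}
			sealed := aead.Seal(nonce, nonce, buf[:n], nil)
			// Length-prefix each sealed block so restoration can decrypt block by block.
			if err := binary.Write(out, binary.BigEndian, uint32(len(sealed))); err != nil {
				return err
			}
			if _, err := out.Write(sealed); err != nil {
				return err
			}
		}
		if rerr == io.EOF || rerr == io.ErrUnexpectedEOF {
			return nil
		}
		if rerr != nil {
			return rerr
		}
	}
}

func main() {
	cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"127.0.0.1:2379"}, DialTimeout: 5 * time.Second})
	if err != nil {
		panic(err)
	}
	defer cli.Close()
	key := make([]byte, 32) // placeholder key for the sketch
	if err := snapshotEncrypted(cli, key, "snapshot.db.enc"); err != nil {
		panic(err)
	}
}
```
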
shreyas-s-rao commented 5 years ago

we encrypt on the fly while writing the snapshot to the disk in the first place

@amshuman-kr I agree with @swapnilgm that this will be a memory and time overhead for etcd, but we can test this out to confirm it.

@vlerenc The limit of 5 chunks is only for the upload and not for the data on disk itself. So the chunk-upload-limit of 5 would not affect the decision on volume size in any way.

I assume, we can and will hold a chunk in memory (encrypt + upload and download + decrypt)

For restoration, we will have (downloaded + decrypted + old corrupted data).
Regarding restoration, I agree with @swapnilgm that we need thrice the data size: old corrupted etcd data + downloaded full snapshot + decrypted full snapshot + downloaded delta snapshots (6 at a time, as configured currently) + decrypted delta snapshots. We would also require some extra buffer for applying the delta snapshots, because the initial etcd data is the full snapshot itself, but applying each delta increases the etcd size and we don't purge the delta file immediately (currently, delta snapshots are cleaned up at the end of restoration, and there is scope for optimisation here).

@swapnilgm Can we achieve on-the-fly encryption with stream cipher? I'm guessing that would reduce the disk space requirement.

swapnilgm commented 5 years ago

@vlerenc

I don't know whether I can follow. I assume, we can and will hold a chunk in memory (encrypt + upload and download + decrypt). We anyways limit ourselves to 5 or so chunks to not become a noisy neighbour to ETCD itself and we can keep these few chunks in memory, right? In that case, even if upload or download fails, we retry until it succeeds. So why would we need three times the space?

When we say it's a streaming API for a chunk: with the current chunk upload, the entire chunk upload happens at the IO level, and hence the chunk upload does not contribute to memory. Now, if we load the chunk in memory and encrypt it, it will add to memory. It's not about being a noisy neighbour here, but memory usage will increase to (number of chunks under processing * chunk size). This will add to memory, though with current observations for a seed it's hardly 50 MB for 5 chunks. But with increasing DB size it will increase (for a 2 GiB DB, 500 MiB extra memory). As I said in an earlier comment as well, I'm fine with this approach at the liberty of setting a higher memory limit for the sidecar.
Instead I was going for thrice the disk space because, AFAIK, considering the IOPS limit we have to provision more disk space than required anyway, so why not use it and optimise on memory here.

swapnilgm commented 5 years ago

So at any given time, on the disk we'll have the actual etcd data + all encrypted chunks associated with the current snapshot (full or delta).

Please check comment: https://github.com/gardener/etcd-backup-restore/issues/83#issuecomment-479748253

@swapnilgm Can we achieve on-the-fly encryption with stream cipher? I'm guessing that would reduce the disk space requirement.

Please check comment https://github.com/gardener/etcd-backup-restore/issues/83#issuecomment-479751327

amshuman-kr commented 5 years ago

Let me clarify terminology, because I myself am confused by the above discussion.

amshuman-kr commented 5 years ago

@swapnilgm From the above, I would prefer not to couple the upload chunk size to the encryption chunk size. This means that the upload chunk size would be independent of the chunk size for encryption. This makes some sort of on-the-fly encryption (in terms of the eventual physical encrypted file, and not in terms of how such encryption is achieved) mandatory.

WDYT?

amshuman-kr commented 5 years ago

Now to the actual on-the-fly encryption part. I think there are at least 3 ways to do this. I found this helpful.

  1. https://golang.org/pkg/crypto/cipher/#example_StreamWriter. Checksum would have to be implemented.
  2. A custom streaming using https://github.com/golang/go/blob/master/src/crypto/cipher/gcm.go#L17. Streaming and checksum would have to be implemented.
  3. https://github.com/minio/sio solves it by chunking and checksumming every chunk.

I am leaning towards minio/DARE. Regarding the impact on performance, I think it will be minimal. But that is just an opinion. Actual data would beat it any day :-)
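
For illustration, a minimal hedged sketch of what option 3 could look like with github.com/minio/sio (DARE); the snapshot file paths and the key handling are placeholders:

```go
// Sketch: stream-encrypt a snapshot file using minio/sio (DARE format).
package main

import (
	"crypto/rand"
	"io"
	"os"

	"github.com/minio/sio"
)

func main() {
	key := make([]byte, 32) // placeholder; a real key would come from the key-management scheme
	if _, err := io.ReadFull(rand.Reader, key); err != nil {
		panic(err)
	}

	in, err := os.Open("snapshot.db") // hypothetical plaintext snapshot path
	if err != nil {
		panic(err)
	}
	defer in.Close()

	out, err := os.Create("snapshot.db.enc")
	if err != nil {
		panic(err)
	}
	defer out.Close()

	// sio.Encrypt splits the stream into authenticated DARE packages, so memory
	// usage stays roughly constant regardless of snapshot size; sio.Decrypt is
	// the symmetric counterpart for restoration.
	if _, err := sio.Encrypt(out, in, sio.Config{Key: key}); err != nil {
		panic(err)
	}
}
```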

swapnilgm commented 5 years ago

Just to clarify further on terminology, from here onwards I'll use:

chunk: a chunk in the multi-part upload of the snapshot; the chunk size is dynamic w.r.t. backup size and cloud provider support.
block: a block (or chunk) used in block cipher encryption; the block size is constant, commonly 256 bit = 32 byte.

Now regarding on-the-fly encryption, please don't misunderstand me; I'm not completely against it. I was raising my concerns regarding the performance aspects we might have to consider. If the majority feel it's good, I'm fine to go with it and deal with any performance hit that might come later.

From whatever I could remember from my earlier lessons about cryptography and whatever I could check in the last couple of days:

https://golang.org/pkg/crypto/cipher/#example_StreamWriter. Checksum would have to be implemented.

As I mentioned in the very first comment, IMO we will probably want to go for a block cipher mode rather than a stream cipher, considering the common use case and the advantages and disadvantages. So, I didn't dare to look at it.

A custom streaming using https://github.com/golang/go/blob/master/src/crypto/cipher/gcm.go#L17. Streaming and checksum would have to be implemented.

Yes. I would prefer going for GCM. We need streaming here anyway. The granularity of the data size (it can be the block or the chunk size) depends on the approach we follow for encryption, on-the-fly or store-and-forward. I didn't get what you are referring to by checksum here?

https://github.com/minio/sio solves it by chunking and checksumming every chunk.

This goes under the streaming part in point 2. With the chunked encryption approach I was also going to refer to minio/DARE.

amshuman-kr commented 5 years ago

Now regarding on-the-fly encryption, please don't misunderstand me; I'm not completely against it. I was raising my concerns regarding the performance aspects we might have to consider. If the majority feel it's good, I'm fine to go with it and deal with any performance hit that might come later.

To be clear, let us not take a decision without doing due diligence. It would be good to know what the performance impact is.

From the cost perspective, at first look, if we decide to go with gp2, then it might look like both options (inline as well as a separate file) are available to us. Otherwise, we might have to worry more about storage optimisation.

However, we also have to take into account the multiple read/write cycles for the same files in the separate-file approach. So, it may not be that straightforward even if we go for gp2.

marwinski commented 5 years ago

Sorry for the delay in contributing here. I cannot really comment on the best way of doing the encryption but need to mention one important aspect: please do not copy the database (snapshot) to the target location, encrypt it, and then delete the unencrypted one; it will probably remain there. If the "in memory" encryption does not work (although my gut feeling tells me that it should), encrypt in the source location (with the downside of requiring more disk space).

shreyas-s-rao commented 5 years ago

After preliminary evaluation of four approaches:

I have arrived at a few advantages and disadvantages for each of them:

Summary: As decryption (store-and-decrypt) will anyway dictate that the volume size be 3x the DB size, we have some flexibility in choosing the encryption strategy and can focus on optimising memory, CPU and time there. I personally prefer Minio because of its optimised resource usage and also because it is production-ready. Again, this is just a preliminary evaluation, and we can always evaluate other viable options when we pick up this task for implementation.

@marwinski @vlerenc WDYT?

vlerenc commented 5 years ago

If we need more disk space, let's incorporate that in the new ETCD disk sizes. Luckily, we anyways picked larger disks for AWS and Azure. Let's now also go for slightly larger disks for Aliyun and GCP. 25 or 30 GB are also OK.

As for the optimisation, just the general remark that we cannot use any solution that needs to load everything into main memory, and that we need to optimise for encryption rather than decryption. Encryption happens orders of magnitude more often than decryption. And encryption happens in parallel to normal operations, while we decrypt only because the ETCD PV is corrupt or lost, so there is anyway no ETCD running in parallel competing for resources.

amshuman-kr commented 5 years ago

According to quick measurements from @shreyas-s-rao, it looks like the overhead of encryption (in-flight) is minimal, especially on CPU utilisation, and I am confident we can get the memory overhead to be minimal and O(1).

However, the decryption latency seems to be quite large for in-flight decryption as compared to a download-followed-by-decryption approach. This could be due to the choice of algorithm or the particular implementation in the library. But if this remains the case generally, we might have to budget 3 times the size of etcd on the disk for decryption, if not for encryption (even if we decouple restoration from restarts in the future, we would still, most probably, stick to a single volume).

vlerenc commented 5 years ago

Hmm... when reading your above sentence: for encryption (in parallel to normal operations), we can do in-flight encryption, and only there would we have needed 3x the space. For decryption, we don't need it, because there is no parallel ETCD, or am I wrong? In my naive view, we download everything and we decrypt it; that's only 2x. So, if we can do encryption in-flight, then 2x seems OK, or am I wrong?

amshuman-kr commented 5 years ago

For decryption, we don't need it, because there is no parallel ETCD, or am I wrong?

At present, we do not delete the old etcd folder during restoration and replace it only after restoration has succeeded. I think this was us being conservative. If we change this then 2x would be fine for decryption too.

swapnilgm commented 5 years ago

As per the discussion, in the context of Gardener, or to be precise etcd backing Kubernetes, we make use of the k8s secret encryption feature, so basically the critical data is encrypted.

But since etcd-backup-restore is an open-source project and may be used for HA of any etcd, the encryption feature for backups brings good value.

Hence, we will keep it open; it can be implemented once we are done with other priority tasks.

vlerenc commented 3 years ago

There are two more reasons why we may need to build that: (1) metadata (resource names) is partly also sensitive information, and (2) other, partly unknown resources like custom resources may, and often will, contain sensitive information, too. Unless (1) is also encrypted (resource names), which I don't know, and we make the to-be-encrypted resources configurable in the shoot resource to counter (2), we may still need to build backup encryption.

/cc @ThormaehlenFred

vlerenc commented 3 years ago

In addition, I just noticed we have another problem. We keep old backup entries for some time, but since we introduced ETCD encryption keys, nobody can read the encrypted resources anymore once the cluster gets deleted and the ETCD encryption key is deleted (in the shoot namespace in the seed cluster and in the shoot state in the garden cluster). Isn't that so, @shreyas-s-rao?