ForgeRock / ds-operator

ForgeRock Directory Service Operator

Unable to take backup DS to google storage workload identity #63

Closed paritoshdubey closed 2 years ago

paritoshdubey commented 2 years ago

Hi All,

We are not able to take a backup of ForgeRock DS to Google Storage buckets using Workload Identity. Our company policy doesn't allow us to create service account keys for SAs. We are using ForgeRock DS version 7.1 on GKE version 1.20. Please let me know how we can schedule backups of DS without using Google SA JSON keys.

wstrange commented 2 years ago

The backup and restore feature of the ds-operator is currently in alpha, and is probably what you want. This feature backs up the directory to a Kubernetes VolumeSnapshot, and/or performs an LDIF export of the data to a PVC.

The operator needs to function across all Kubernetes implementations, and therefore cannot use platform-specific features such as GCP Workload Identity.

Our strategy is to leave the data on a PVC, and let users configure their own processes to perform "last mile" backup to a specific platform (S3, GCS, etc).

For example, you can mount the PVC on your own Kubernetes Job that performs the backup to GCS. It might use the gsutil command provided by Google to copy the data from the mounted disk to GCS storage. An AWS user would have a similar job, but using the aws cli to copy the data. This gives you 100% control over how the data is archived.
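A minimal sketch of such a Job might look like the following. Note this is an illustration, not a supported manifest: the claim name, bucket, namespace, and service account are placeholders you would replace with your own.

```yaml
# Hypothetical Job that copies the DS backup PVC contents to a GCS bucket.
# The claim name, bucket, image tag, and KSA name are placeholders.
apiVersion: batch/v1
kind: Job
metadata:
  name: ds-backup-to-gcs
spec:
  template:
    spec:
      # KSA bound to a GCP service account via Workload Identity (no JSON keys)
      serviceAccountName: gcs-backup
      restartPolicy: Never
      containers:
        - name: gsutil
          image: gcr.io/google.com/cloudsdktool/cloud-sdk:slim
          # Mirror the mounted backup directory into the bucket
          command: ["gsutil", "-m", "rsync", "-r", "/backup", "gs://my-ds-backup-bucket"]
          volumeMounts:
            - name: backup
              mountPath: /backup
      volumes:
        - name: backup
          persistentVolumeClaim:
            claimName: ds-backup   # the PVC the operator exported data to
```

Because the Job runs under a Workload Identity-bound service account, gsutil picks up credentials from the GKE metadata server and no SA key secret is needed.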

paritoshdubey commented 2 years ago

We have tested backup and restore of DS using a Google SA with JSON keys mounted as a secret to DS, which schedules the backup and restores the data from Google Storage even if the PVC is deleted. We now need a solution for taking a backup to Google Storage without depending on a PVC.

wstrange commented 2 years ago

Please take a look at the DirectoryBackup Custom Resource. This exports directory data to a PVC. From there, you can "bring your own" Kubernetes job to further process the data. You can look at the example here for inspiration:

https://github.com/ForgeRock/forgeops/blob/master/etc/ds/ds-backup-volume/gsutil.yaml

This example can be adapted to use workload identity.
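Adapting a Job to Workload Identity generally comes down to annotating the Kubernetes service account the Job runs as with the GCP service account to impersonate. A sketch, with the project, namespace, and SA names as placeholders:

```yaml
# Hypothetical KSA annotated for GKE Workload Identity; all names are placeholders.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gcs-backup
  namespace: ds
  annotations:
    iam.gke.io/gcp-service-account: ds-backup@my-project.iam.gserviceaccount.com
```

The GCP service account additionally needs an IAM binding granting `roles/iam.workloadIdentityUser` to the member `serviceAccount:my-project.svc.id.goog[ds/gcs-backup]`, plus write access to the target bucket.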

The surface area of external systems (S3, GCS, ACD, Minio, etc.) is too broad for the operator to cover all use cases. By focusing on backing up to a staging PVC, the operator stays "Kubernetes native", while still giving users the ability to bring their own custom backup jobs that meet their desired policy.

For general-purpose backup and restore, you might also want to look at https://velero.io. It is well supported and has an active community. It also supports a number of backends, including GCS.

wstrange commented 2 years ago

I have added a sample in https://github.com/ForgeRock/ds-operator/tree/master/tests/gcs that demonstrates syncing the contents of the backup PVC to a GCS storage bucket. The sample uses Workload Identity.