aerospike / aerospike-kubernetes-operator

Kubernetes operator for the Aerospike database
https://docs.aerospike.com/cloud/kubernetes/operator
Apache License 2.0

Getting "Reconciler error" when trying to run aerospike restore using CR #323

Closed by alicvsroxas 3 days ago

alicvsroxas commented 1 week ago

Hello team,

I'm trying to use the AerospikeRestore CR to restore data from an S3 bucket. The AerospikeBackup CR below works, and I can see the backup files in S3:

apiVersion: asdb.aerospike.com/v1beta1
kind: AerospikeBackup
metadata:
  name: aerospikebackup
  namespace: ${namespace}
spec:
  backupService:
    name: aerospikebackupservice
    namespace: ${namespace}
  config:
    aerospike-cluster:
      ${namespace}-aerospikebackup-${aerospike_cluster}: 
        credentials:
          # Make sure the password file is mounted in the backup service pod via secret at the path specified here.
          password-path: ${password-path}
          user: ${username}
        seed-nodes:
          - host-name: ${aerospike_cluster}.${namespace}.svc.cluster.local
            port: ${aerospike_port}
    backup-routines:
      # Name format: The name must begin with the prefix <backup-namespace>-<backup-name>
      ${namespace}-aerospikebackup-${aerospike_cluster}-routine:  
        backup-policy: default-policy
        interval-cron: "@daily"
        incr-interval-cron: "@hourly"
        namespaces: 
          - ns1
        source-cluster: ${namespace}-aerospikebackup-${aerospike_cluster}
        storage: s3Storage

But when I try to run a full or incremental restore with the following CR:

apiVersion: asdb.aerospike.com/v1beta1
kind: AerospikeRestore
metadata:
  name: aerospikerestore
  namespace: ${namespace}
spec:
  backupService:
    name: aerospikebackupservice
    namespace: ${namespace}
  type: Full
  config:
    destination:
      label: destinationCluster
      credentials:
        # Make sure the password file is mounted in the backup service pod via secret at the path specified here.
        password-path: ${password-path}
        user: ${username}
      seed-nodes:
        - host-name: ${aerospike_cluster}.${namespace}.svc.cluster.local
          port: ${aerospike_port}
    policy:
      no-generation: true
      no-indexes: true
    source:
      "path": "s3://${bucket_name}/${routine_name}/backup/1730897836735/data/ns1"
      "type": aws-s3
      "s3-region": ${aws_region}

I get the following error from the aerospike operator:

manager 2024-11-06T14:23:17Z    ERROR    Reconciler error    {"controller": "aerospikerestore", "controllerGroup": "asdb.aerospike.com", "controllerKind": "AerospikeRestore", "AerospikeRestore": {"name":"aerospikerestore","namespace":"aerospike"}, "namespace": "aerospike", "name": "aerospikerestore", "reconcileID": "c38aa310-0765-46f4-abf0-af3f292d4cbc", "error": "Post \"http://aerospikebackupservice.aerospike.svc:8081/v1/restore/full\": EOF"}
manager sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
manager     /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.3/pkg/internal/controller/controller.go:329
manager sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
manager     /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.3/pkg/internal/controller/controller.go:266
manager sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
manager     /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.3/pkg/internal/controller/controller.go:227

When I try a restore using the routine and timestamp method instead:

apiVersion: asdb.aerospike.com/v1beta1
kind: AerospikeRestore
metadata:
  name: aerospikerestore
  namespace: ${namespace}
spec:
  backupService:
    name: aerospikebackupservice
    namespace: ${namespace}
  type: Full
  config:
    destination:
      label: destinationCluster
      credentials:
        # Make sure the password file is mounted in the backup service pod via secret at the path specified here.
        password-path: ${password-path}
        user: ${username}
      seed-nodes:
        - host-name: ${aerospike_cluster}.${namespace}.svc.cluster.local
          port: ${aerospike_port}
    policy:
      no-generation: true
      no-indexes: true
    routine: ${namespace}-aerospikebackup-${aerospike_cluster}-routine
    time: 1730897836735

I get the following errors:

manager 2024-11-06T14:36:24Z    ERROR    controller.AerospikeRestore    Failed to trigger restore with status code 400    {"aerospikerestore": {"name":"aerospikerestore","namespace":"aerospike"}, "error": "failed to trigger Timestamp restore, error: restore failed: backup not found: 2024-11-06 12:57:16.735 +0000 UTC\n"}
manager github.com/aerospike/aerospike-kubernetes-operator/internal/controller/restore.(*SingleRestoreReconciler).reconcileRestore
manager     /workspace/internal/controller/restore/reconciler.go:96
manager github.com/aerospike/aerospike-kubernetes-operator/internal/controller/restore.(*SingleRestoreReconciler).Reconcile
manager     /workspace/internal/controller/restore/reconciler.go:42
manager github.com/aerospike/aerospike-kubernetes-operator/internal/controller/restore.(*AerospikeRestoreReconciler).Reconcile
manager     /workspace/internal/controller/restore/aerospikerestore_controller.go:72
manager sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
manager     /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.3/pkg/internal/controller/controller.go:119
manager sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
manager     /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.3/pkg/internal/controller/controller.go:316
manager sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
manager     /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.3/pkg/internal/controller/controller.go:266
manager sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
manager     /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.3/pkg/internal/controller/controller.go:227
manager 2024-11-06T14:36:24Z    ERROR    Reconciler error    {"controller": "aerospikerestore", "controllerGroup": "asdb.aerospike.com", "controllerKind": "AerospikeRestore", "AerospikeRestore": {"name":"aerospikerestore","namespace":"aerospike"}, "namespace": "aerospike", "name": "aerospikerestore", "reconcileID": "782862bf-82eb-42fa-9286-99b426498035", "error": "nil terminal error"}
manager sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
manager     /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.3/pkg/internal/controller/controller.go:329
manager sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
manager     /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.3/pkg/internal/controller/controller.go:266
manager sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
manager     /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.3/pkg/internal/controller/controller.go:227

I'm following the examples in these docs: https://aerospike.com/docs/cloud/kubernetes/operator/backup-and-restore/restore-configuration and using this page as a reference: https://aerospike.com/docs/tools/backup-service/examples#direct-restore-using-a-specific-backup

Did I miss some configuration? I tried multiple ways of setting the path to the S3 bucket, but none of them seem to work.
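For reference, the `time` value in the timestamp restore CR is an epoch timestamp in milliseconds, which is how the backup service arrives at the date printed in the "backup not found" error. A minimal Python sketch of that conversion, useful for checking that the value matches the backup you expect (the helper name `epoch_ms_to_utc` is made up for illustration and is not part of the operator or the backup service):

```python
from datetime import datetime, timezone


def epoch_ms_to_utc(ms: int) -> datetime:
    """Convert an epoch timestamp in milliseconds to a timezone-aware UTC datetime."""
    # Split into whole seconds and leftover milliseconds to avoid float rounding.
    seconds, millis = divmod(ms, 1000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base.replace(microsecond=millis * 1000)


# The `time` value used in the AerospikeRestore CR above.
restore_time = epoch_ms_to_utc(1730897836735)
print(restore_time.isoformat(timespec="milliseconds"))
# prints 2024-11-06T12:57:16.735+00:00
```

The result matches the timestamp in the operator log, so the value was parsed as intended and the 400 response is about the backup lookup itself rather than the time format.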

alicvsroxas commented 3 days ago

I found the solution with help from support. When restoring from an S3 bucket, the source section of the AerospikeRestore CR must carry the same settings as the s3Storage section of the Aerospike Backup Service (ABS) CR:

apiVersion: asdb.aerospike.com/v1beta1
kind: AerospikeRestore
metadata:
  name: aerospikerestore
  namespace: ${namespace}
spec:
  backupService:
    name: aerospikebackupservice
    namespace: ${namespace}
  type: Full
  config:
    destination:
      label: destinationCluster
      credentials:
        # Make sure the password file is mounted in the backup service pod via secret at the path specified here.
        password-path: ${password-path}
        user: ${username}
      seed-nodes:
        - host-name: ${aerospike_cluster}.${namespace}
          port: ${aerospike_port}
    policy:
      no-generation: true
      no-indexes: true
    source:
      "path": "s3://${bucket_name}/${routine_name}/backup/1730897836735/data/ns1"
      "type": aws-s3
      "s3-region": ${aws_region}
      # I had missed the following two lines:
      "s3-endpoint-override": ""
      "s3-profile": ""
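One way to catch this kind of mismatch before applying the CR is to compare the keys of the two sections. The sketch below is only an illustration: the dictionaries are hand-written stand-ins for the s3Storage entry and the restore `source` block, with placeholder values, and `missing_source_keys` is a hypothetical helper rather than anything provided by the operator.

```python
def missing_source_keys(storage: dict, source: dict) -> list[str]:
    """Return the storage keys that are absent from the restore source, sorted."""
    return sorted(key for key in storage if key not in source)


# Hypothetical stand-in for the s3Storage section of the backup service CR.
backup_service_storage = {
    "type": "aws-s3",
    "path": "s3://example-bucket",
    "s3-region": "example-region",
    "s3-endpoint-override": "",
    "s3-profile": "",
}

# Stand-in for the `source` block of the restore CR that failed with EOF.
failing_source = {
    "type": "aws-s3",
    "path": "s3://example-bucket/example-routine/backup/1730897836735/data/ns1",
    "s3-region": "example-region",
}

print(missing_source_keys(backup_service_storage, failing_source))
# prints ['s3-endpoint-override', 's3-profile']
```

Only the key sets are compared, because the `path` value legitimately differs: the restore source points at one specific backup inside the bucket.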

I'd strongly suggest adding an example of configuring an AerospikeRestore with an S3 bucket to the docs here: https://aerospike.com/docs/cloud/kubernetes/operator/backup-and-restore/restore-configuration

Closing this issue.