k8ssandra / k8ssandra-operator

The Kubernetes operator for K8ssandra
https://k8ssandra.io/
Apache License 2.0

Medusa same cluster Restore is failing because it is looking for data with node ips instead of kubernetes pod names #585

Open grassiale opened 2 years ago

grassiale commented 2 years ago

Hello again k8ssandra community

What did you do? We are performing a Medusa backup of a Cassandra datacenter with 3 racks and trying to restore the data on the same cluster. The restore fails because medusa-restore is apparently trying to download data from folders that refer to the IPs of the Cassandra nodes instead of their pod names. In fact, only one of the pods is able to restore its data correctly; the other ones are left by the operator in an Init:CrashLoopBackOff state. There are some errors in the operator logs (shown below), but they don't tell me much. Fun fact: when I tried the same procedure on a 2-rack cluster without token customizations, it worked, but with the same error in the logs.

Did you expect to see something different? We were expecting the Medusa restore to download data, on all nodes, from the same S3 paths it was uploaded to.

Environment

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: k8ssandra-backup-test
spec:
  auth: false
  cassandra: 
    serverVersion: "4.0.4"
    serverImage: "k8ssandra/cass-management-api:4.0.4-v0.1.40"
    telemetry: 
      prometheus:
        enabled: true
    datacenters:
    - metadata:
        name: dc-backup-test
      size: 3
      config:
        jvmOptions:
          heapSize: "15G"
          additionalOptions:
            - "-Dcassandra.system_distributed_replication_dc_names=dc-backup-test"
            - "-Dcassandra.system_distributed_replication_per_dc=3"
        cassandraYaml:
          num_tokens: 32
          allocate_tokens_for_local_replication_factor: 3
          authenticator: AllowAllAuthenticator
          authorizer: AllowAllAuthorizer
          role_manager: CassandraRoleManager
          read_request_timeout_in_ms: 8000
          write_request_timeout_in_ms: 6000
          request_timeout_in_ms: 12000
          compaction_throughput_mb_per_sec: 200
          concurrent_compactors: 32 
          stream_entire_sstables: true
          stream_throughput_outbound_megabits_per_sec: 61440
          streaming_connections_per_host: 6
      storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: ssd
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 400Gi
            persistentVolumeReclaimPolicy: Retain
      networking:
        nodePort:
          native: 30001
          internode: 30002
      resources:
        requests:
          memory: 60Gi
          cpu: 7000m
      racks:
        - name: rack-a
          nodeAffinityLabels:
            topology.kubernetes.io/zone: eu-west-1a
        - name: rack-b
          nodeAffinityLabels:
            topology.kubernetes.io/zone: eu-west-1b
        - name: rack-c
          nodeAffinityLabels:
            topology.kubernetes.io/zone: eu-west-1c
  medusa:
    storageProperties:
      storageProvider: s3
      storageSecretRef:
        name: medusa-bucket-key
      bucketName: bucket-name
      prefix: cassandra-small
      region: eu-west-1
      transferMaxBandwidth: 5000MB/s
      concurrentTransfers: 16
---
apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaBackupJob
metadata:
  name: backup-20220623-dc-backup-test
spec:
  cassandraDatacenter: dc-backup-test
  backupType: full
---
apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaRestoreJob
metadata:
  name: restore-20220623-dc-backup-test
spec:
  cassandraDatacenter: dc-backup-test
  backup: backup-20220623-dc-backup-test
1.655995580045647e+09   INFO    controller.medusarestorejob Starting reconcile  {"reconciler group": "medusa.k8ssandra.io", "reconciler kind": "MedusaRestoreJob", "name": "restore-20220623-dc-small", "namespace": "k8ssandra", "medusarestorejob": "k8ssandra/restore-20220623-dc-small", "MedusaRestoreJob": "restore-20220623-dc-small"}
1.6559955800583858e+09  INFO    controller.medusarestorejob updated MedusaRestoreJob with owner reference   {"reconciler group": "medusa.k8ssandra.io", "reconciler kind": "MedusaRestoreJob", "name": "restore-20220623-dc-small", "namespace": "k8ssandra", "medusarestorejob": "k8ssandra/restore-20220623-dc-small", "CassandraDatacenter": {"namespace": "k8ssandra", "name": "dc-backup-test"}}
1.655995580060357e+09   INFO    controller.medusarestorejob Starting reconcile  {"reconciler group": "medusa.k8ssandra.io", "reconciler kind": "MedusaRestoreJob", "name": "restore-20220623-dc-small", "namespace": "k8ssandra", "medusarestorejob": "k8ssandra/restore-20220623-dc-small", "MedusaRestoreJob": "restore-20220623-dc-small"}
1.6559955800707614e+09  INFO    controller.medusatask   Starting reconciliation for MedusaTask{"reconciler group": "medusa.k8ssandra.io", "reconciler kind": "MedusaTask", "name": "f6d50670-9b1c-4e45-a27b-5973e73781d3", "namespace": "k8ssandra", "MedusaTask": "k8ssandra/f6d50670-9b1c-4e45-a27b-5973e73781d3", "cassdc": "dc-backup-test"}
1.655995580077983e+09   INFO    controller.medusatask   updated task with owner reference   {"reconciler group": "medusa.k8ssandra.io", "reconciler kind": "MedusaTask", "name": "f6d50670-9b1c-4e45-a27b-5973e73781d3", "namespace": "k8ssandra", "MedusaTask": "k8ssandra/f6d50670-9b1c-4e45-a27b-5973e73781d3", "cassdc": "dc-backup-test", "CassandraDatacenter": {"namespace": "k8ssandra", "name": "dc-backup-test"}}
1.6559955800791261e+09  INFO    controller.medusatask   Starting reconciliation for MedusaTask{"reconciler group": "medusa.k8ssandra.io", "reconciler kind": "MedusaTask", "name": "f6d50670-9b1c-4e45-a27b-5973e73781d3", "namespace": "k8ssandra", "MedusaTask": "k8ssandra/f6d50670-9b1c-4e45-a27b-5973e73781d3", "cassdc": "dc-backup-test"}
1.6559955800794597e+09  INFO    controller.medusatask   Starting prepare restore operations {"reconciler group": "medusa.k8ssandra.io", "reconciler kind": "MedusaTask", "name": "f6d50670-9b1c-4e45-a27b-5973e73781d3", "namespace": "k8ssandra", "MedusaTask": "k8ssandra/f6d50670-9b1c-4e45-a27b-5973e73781d3", "cassdc": "dc-backup-test", "cassdc": "dc-backup-test"}
1.6559955800893202e+09  INFO    controller.medusarestorejob Starting reconcile  {"reconciler group": "medusa.k8ssandra.io", "reconciler kind": "MedusaRestoreJob", "name": "restore-20220623-dc-small", "namespace": "k8ssandra", "medusarestorejob": "k8ssandra/restore-20220623-dc-small", "MedusaRestoreJob": "restore-20220623-dc-small"}
1.655995580090005e+09   ERROR   controller.medusarestorejob Failed to prepare restore   {"reconciler group": "medusa.k8ssandra.io", "reconciler kind": "MedusaRestoreJob", "name": "restore-20220623-dc-small", "namespace": "k8ssandra", "medusarestorejob": "k8ssandra/restore-20220623-dc-small", "error": "prepare restore task failed for restore restore-20220623-dc-small"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
1.65599558009014e+09    ERROR   controller.medusarestorejob Reconciler error    {"reconciler group": "medusa.k8ssandra.io", "reconciler kind": "MedusaRestoreJob", "name": "restore-20220623-dc-small", "namespace": "k8ssandra", "error": "prepare restore task failed for restore restore-20220623-dc-small"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
1.6559955800904531e+09  INFO    controller.medusatask   starting pod operation  {"reconciler group": "medusa.k8ssandra.io", "reconciler kind": "MedusaTask", "name": "f6d50670-9b1c-4e45-a27b-5973e73781d3", "namespace": "k8ssandra", "MedusaTask": "k8ssandra/f6d50670-9b1c-4e45-a27b-5973e73781d3", "cassdc": "dc-backup-test", "Operation": "prepare restore", "CassandraPod": "k8ssandra-backup-test-dc-backup-test-rack-b-sts-0"}
1.6559955800908427e+09  INFO    controller.medusatask   Starting reconciliation for MedusaTask{"reconciler group": "medusa.k8ssandra.io", "reconciler kind": "MedusaTask", "name": "f6d50670-9b1c-4e45-a27b-5973e73781d3", "namespace": "k8ssandra", "MedusaTask": "k8ssandra/f6d50670-9b1c-4e45-a27b-5973e73781d3", "cassdc": "dc-backup-test"}
1.6559955800908892e+09  INFO    controller.medusatask   Tasks already in progress   {"reconciler group": "medusa.k8ssandra.io", "reconciler kind": "MedusaTask", "name": "f6d50670-9b1c-4e45-a27b-5973e73781d3", "namespace": "k8ssandra", "MedusaTask": "k8ssandra/f6d50670-9b1c-4e45-a27b-5973e73781d3", "cassdc": "dc-backup-test"}
1.6559955800957592e+09  INFO    controller.medusarestorejob Starting reconcile  {"reconciler group": "medusa.k8ssandra.io", "reconciler kind": "MedusaRestoreJob", "name": "restore-20220623-dc-small", "namespace": "k8ssandra", "medusarestorejob": "k8ssandra/restore-20220623-dc-small", "MedusaRestoreJob": "restore-20220623-dc-small"}

Anything else we need to know?: Logs from the container: medusa-restore-logs.log

┆Issue is synchronized with this Jira Story by Unito ┆Issue Number: K8OP-178

grassiale commented 2 years ago

Hello, do you have any updates on this issue? We are currently working around this by manually substituting the mappings on the nodes. What essentially happens is that the nodes of the Cassandra cluster (all except one, actually) end up with a mapping in /var/lib/cassandra/.restore_mappings that contains IPs instead of pod names:

{
    "host_map":
    {
        "10.71.16.133":
        {
            "seed": false,
            "source":
            [
                "k8ssandra-rack-a-sts-0"
            ]
        },
        "10.71.86.153":
        {
            "seed": false,
            "source":
            [
                "10.71.47.236"
            ]
        },
        "localhost":
        {
            "seed": false,
            "source":
            [
                "10.71.47.233"
            ]
        }
    },
    "in_place": false
}

What we do is: scale the operator to 0 pods, scale the statefulsets to 0, substitute the IP in the localhost key of the JSON with the respective pod name on every volume of the cluster, and restart the operator. This makes restores work, but it is obviously not ideal.
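
For reference, a minimal sketch of that substitution step in Python, assuming the operator and statefulsets are already scaled down and the script is run against each node's data volume; the IP-to-pod-name pair below is a hypothetical placeholder to fill in per node:

import json

MAPPING_FILE = "/var/lib/cassandra/.restore_mappings"
# Hypothetical pair; use the real IP and pod name of the node being fixed.
IP_TO_POD = {"10.71.47.233": "k8ssandra-rack-b-sts-0"}

with open(MAPPING_FILE) as f:
    mappings = json.load(f)

# Swap the IP listed as the "localhost" source for the matching pod name.
mappings["host_map"]["localhost"]["source"] = [
    IP_TO_POD.get(src, src) for src in mappings["host_map"]["localhost"]["source"]
]

with open(MAPPING_FILE, "w") as f:
    json.dump(mappings, f, indent=4)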

If you think I should open an issue on https://github.com/thelastpickle/cassandra-medusa instead, because it is not inherent to k8ssandra, please let me know.

adejanovski commented 2 years ago

Hi @grassiale,

If the source in the mapping looks like what you're seeing (one entry with the pod name, the others with IP addresses), then it means the pods aren't managing to resolve the other pods' IPs to their hostnames. You can verify this by checking the topology*.json files in the backup metadata in the S3 bucket, which should show the same pattern (one pod name, two IP addresses).

Could you check which version of cass-operator is running? We made the necessary changes to fix pod IP address resolution in v1.11.0, and that's what k8ssandra-operator should install when running v1.1.0 and v1.1.1.

We're going to need the output of kubectl get cassdc/dc-backup-test -o yaml and the same for the statefulset: kubectl get statefulset/k8ssandra-backup-test-dc-backup-test-rack-a-sts -o yaml. I'd like to check what the serviceName value is there.
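
If it's easier to script, here's roughly the same check with the Python Kubernetes client (assuming it is installed and a kubeconfig is available), reading just the serviceName of each rack's statefulset, since that is the headless service backing the pods' stable DNS names; the statefulset names follow the pattern from the spec above:

from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod
apps = client.AppsV1Api()

for rack in ("rack-a", "rack-b", "rack-c"):
    sts_name = f"k8ssandra-backup-test-dc-backup-test-{rack}-sts"
    sts = apps.read_namespaced_stateful_set(sts_name, "k8ssandra")
    # serviceName is the governing headless service used for pod DNS records.
    print(sts_name, "->", sts.spec.service_name)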

adejanovski commented 2 years ago

Hi @grassiale, any update on this issue?

grassiale commented 2 years ago

Sorry, I was away for my PTO and it slipped my mind. Will have a look at it tomorrow.

grassiale commented 2 years ago

OK, sorry for the huge delay. I did a fresh test with fresh resources.

I'm using k8ssandra-operator:v1.2.0 and the image for cass-operator is k8ssandra/cass-operator:v1.12.0

The output of getting the CassandraDatacenter is:

apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  annotations:
    k8ssandra.io/resource-hash: 6ZKMF06HT8XN93IFEk45NnP4BTzRUOMB8jYeiuVa9JU=
  creationTimestamp: "2022-10-07T10:16:45Z"
  finalizers:
  - finalizer.cassandra.datastax.com
  generation: 4
  labels:
    app.kubernetes.io/component: cassandra
    app.kubernetes.io/created-by: k8ssandracluster-controller
    app.kubernetes.io/name: k8ssandra-operator
    app.kubernetes.io/part-of: k8ssandra
    k8ssandra.io/cluster-name: dc-backup-test
    k8ssandra.io/cluster-namespace: k8ssandra
  name: dc-backup-test
  namespace: k8ssandra
  resourceVersion: "48656953"
  uid: b550c8e1-1649-4def-9f3b-61ecba8d0add
spec:
  additionalServiceConfig:
    additionalSeedService: {}
    allpodsService: {}
    dcService: {}
    nodePortService: {}
    seedService: {}
  clusterName: dc-backup-test
  config:
    cassandra-env-sh:
      additional-jvm-opts:
      - -Dcassandra.system_distributed_replication=dc-backup-test:3
      - -Dcom.sun.management.jmxremote.authenticate=false
      - -Dcassandra.system_distributed_replication_dc_names=dc-backup-test
      - -Dcassandra.system_distributed_replication_per_dc=3
    cassandra-yaml:
      allocate_tokens_for_local_replication_factor: 3
      authenticator: AllowAllAuthenticator
      authorizer: AllowAllAuthorizer
      compaction_throughput_mb_per_sec: 200
      concurrent_compactors: 32
      num_tokens: 16
      read_request_timeout_in_ms: 8000
      request_timeout_in_ms: 12000
      role_manager: CassandraRoleManager
      stream_entire_sstables: true
      stream_throughput_outbound_megabits_per_sec: 61440
      streaming_connections_per_host: 6
      write_request_timeout_in_ms: 6000
    jvm-server-options:
      initial_heap_size: 1000000000
      max_heap_size: 1000000000
  configBuilderResources: {}
  managementApiAuth: {}
  networking:
    nodePort:
      internode: 30007
      native: 30006
  podTemplateSpec:
    metadata: {}
    spec:
      containers:
      - env:
        - name: MEDUSA_MODE
          value: GRPC
        - name: MEDUSA_TMP_DIR
          value: /var/lib/cassandra
        - name: CQL_USERNAME
          valueFrom:
            secretKeyRef:
              key: username
              name: dc-backup-test-medusa
        - name: CQL_PASSWORD
          valueFrom:
            secretKeyRef:
              key: password
              name: dc-backup-test-medusa
        image: docker.io/k8ssandra/medusa:0.13.4
        imagePullPolicy: IfNotPresent
        name: medusa
        ports:
        - containerPort: 50051
          name: grpc
          protocol: TCP
        resources:
          limits:
            memory: 8Gi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - mountPath: /etc/cassandra
          name: server-config
        - mountPath: /var/lib/cassandra
          name: server-data
        - mountPath: /etc/medusa
          name: dc-backup-test-medusa
        - mountPath: /etc/podinfo
          name: podinfo
        - mountPath: /etc/medusa-secrets
          name: medusa-bucket-key
      - env:
        - name: METRIC_FILTERS
          value: deny:org.apache.cassandra.metrics.Table deny:org.apache.cassandra.metrics.table
            allow:org.apache.cassandra.metrics.table.live_ss_table_count allow:org.apache.cassandra.metrics.Table.LiveSSTableCount
            allow:org.apache.cassandra.metrics.table.live_disk_space_used allow:org.apache.cassandra.metrics.table.LiveDiskSpaceUsed
            allow:org.apache.cassandra.metrics.Table.Pending allow:org.apache.cassandra.metrics.Table.Memtable
            allow:org.apache.cassandra.metrics.Table.Compaction allow:org.apache.cassandra.metrics.table.read
            allow:org.apache.cassandra.metrics.table.write allow:org.apache.cassandra.metrics.table.range
            allow:org.apache.cassandra.metrics.table.coordinator allow:org.apache.cassandra.metrics.table.dropped_mutations
        name: cassandra
        resources: {}
      initContainers:
      - name: server-config-init
        resources: {}
      - env:
        - name: MEDUSA_MODE
          value: RESTORE
        - name: MEDUSA_TMP_DIR
          value: /var/lib/cassandra
        - name: CQL_USERNAME
          valueFrom:
            secretKeyRef:
              key: username
              name: dc-backup-test-medusa
        - name: CQL_PASSWORD
          valueFrom:
            secretKeyRef:
              key: password
              name: dc-backup-test-medusa
        - name: BACKUP_NAME
          value: dc-eu-west-1-07102022
        - name: RESTORE_KEY
          value: fd84d0e4-6ca1-49ec-8fe2-0616ffa07e46
        image: docker.io/k8ssandra/medusa:0.13.4
        imagePullPolicy: IfNotPresent
        name: medusa-restore
        resources:
          limits:
            memory: 8Gi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - mountPath: /etc/cassandra
          name: server-config
        - mountPath: /var/lib/cassandra
          name: server-data
        - mountPath: /etc/medusa
          name: dc-backup-test-medusa
        - mountPath: /etc/podinfo
          name: podinfo
        - mountPath: /etc/medusa-secrets
          name: medusa-bucket-key
      volumes:
      - configMap:
          name: dc-backup-test-medusa
        name: dc-backup-test-medusa
      - name: medusa-bucket-key
        secret:
          secretName: medusa-bucket-key
      - downwardAPI:
          items:
          - fieldRef:
              fieldPath: metadata.labels
            path: labels
        name: podinfo
  racks:
  - name: rack-a
    nodeAffinityLabels:
      topology.kubernetes.io/zone: eu-west-1a
  - name: rack-b
    nodeAffinityLabels:
      topology.kubernetes.io/zone: eu-west-1b
  - name: rack-c
    nodeAffinityLabels:
      topology.kubernetes.io/zone: eu-west-1c
  resources:
    requests:
      cpu: "1"
      memory: 2Gi
  serverImage: k8ssandra/cass-management-api:4.0.4-v0.1.40
  serverType: cassandra
  serverVersion: 4.0.4
  size: 3
  storageConfig:
    cassandraDataVolumeClaimSpec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 2Gi
      storageClassName: gp2
  superuserSecretName: dc-backup-test-superuser
  systemLoggerResources: {}
  tolerations:
  - effect: NoSchedule
    key: datanode
    operator: Equal
    value: "true"
  users:
  - secretName: dc-backup-test-medusa
    superuser: true
status:
  cassandraOperatorProgress: Updating
  conditions:
  - lastTransitionTime: "2022-10-07T10:21:53Z"
    message: ""
    reason: ""
    status: "True"
    type: Healthy
  - lastTransitionTime: "2022-10-07T10:30:57Z"
    message: ""
    reason: ""
    status: "False"
    type: Stopped
  - lastTransitionTime: "2022-10-07T10:21:55Z"
    message: ""
    reason: ""
    status: "False"
    type: ReplacingNodes
  - lastTransitionTime: "2022-10-07T10:30:42Z"
    message: ""
    reason: ""
    status: "False"
    type: Updating
  - lastTransitionTime: "2022-10-07T10:21:55Z"
    message: ""
    reason: ""
    status: "False"
    type: RollingRestart
  - lastTransitionTime: "2022-10-07T10:30:57Z"
    message: ""
    reason: ""
    status: "True"
    type: Resuming
  - lastTransitionTime: "2022-10-07T10:21:55Z"
    message: ""
    reason: ""
    status: "False"
    type: ScalingDown
  - lastTransitionTime: "2022-10-07T10:21:55Z"
    message: ""
    reason: ""
    status: "True"
    type: Valid
  - lastTransitionTime: "2022-10-07T10:21:55Z"
    message: ""
    reason: ""
    status: "True"
    type: Initialized
  - lastTransitionTime: "2022-10-07T10:28:27Z"
    message: ""
    reason: ""
    status: "False"
    type: Ready
  lastServerNodeStarted: "2022-10-07T10:31:47Z"
  nodeStatuses:
    dc-backup-test-dc-backup-test-rack-a-sts-0:
      hostID: a483d9d7-a262-4f34-949a-e406973cbade
    dc-backup-test-dc-backup-test-rack-b-sts-0:
      hostID: e271f210-c3ff-425d-9730-f7a8cc2d398c
    dc-backup-test-dc-backup-test-rack-c-sts-0:
      hostID: 38e165a8-c858-490a-8fb2-a393514803e8
  observedGeneration: 3
  quietPeriod: "2022-10-07T10:30:48Z"
  superUserUpserted: "2022-10-07T10:21:56Z"
  usersUpserted: "2022-10-07T10:21:56Z"

and statefulsets are attached: statefulsets.yaml.zip

I can't find the topology file on S3. The list of files is attached: filelist.txt

FYI, the test on fresh resources gave the results I expected: I created the k8ssandra cluster, performed a backup (which succeeded), then tried a restore; the restore procedure started, only one pod managed to restore successfully, and the other pods are crashlooping in the medusa-restore init container:

dc-backup-test-dc-backup-test-rack-a-sts-0                      3/3     Running                 0          13m
dc-backup-test-dc-backup-test-rack-b-sts-0                      0/3     Init:CrashLoopBackOff   7          13m
dc-backup-test-dc-backup-test-rack-c-sts-0                      0/3     Init:CrashLoopBackOff   7          13m

The log (from the rack-c pod) is:


ERROR:root:No such backup
[2022-10-07 10:42:32,443] ERROR: No such backup
Mapping: {'in_place': True, 'host_map': {'10.71.28.95': {'source': ['dc-backup-test-dc-backup-test-rack-a-sts-0'], 'seed': False}, 'localhost': {'source': ['10.71.57.183'], 'seed': False}, '10.71.71.44': {'source': ['10.71.71.44'], 'seed': False}}}

adejanovski commented 2 years ago

My bad, the files we're interested in are tokenmap*.json, not topology*.json.

They appear here in your file list:

2022-10-07 12:23:33       1316 cassandra-backup-test/index/backup_index/dc-eu-west-1-07102022/tokenmap_dc-backup-test-dc-backup-test-rack-a-sts-0.json
2022-10-07 12:23:33       1315 cassandra-backup-test/index/backup_index/dc-eu-west-1-07102022/tokenmap_dc-backup-test-dc-backup-test-rack-b-sts-0.json
2022-10-07 12:23:33       1316 cassandra-backup-test/index/backup_index/dc-eu-west-1-07102022/tokenmap_dc-backup-test-dc-backup-test-rack-c-sts-0.json
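
For anyone repeating this check, here's a small sketch (assuming boto3 is available, and using the bucket placeholder from the K8ssandraCluster spec above plus the key from the file list) that downloads one tokenmap file and reports which node entries are bare IPs rather than pod names:

import ipaddress
import json

import boto3

BUCKET = "bucket-name"  # placeholder, as in the K8ssandraCluster spec above
KEY = ("cassandra-backup-test/index/backup_index/dc-eu-west-1-07102022/"
       "tokenmap_dc-backup-test-dc-backup-test-rack-a-sts-0.json")

body = boto3.client("s3").get_object(Bucket=BUCKET, Key=KEY)["Body"].read()
tokenmap = json.loads(body)

# The tokenmap is keyed by node identifier; flag entries that are bare IPs.
for node, info in tokenmap.items():
    try:
        ipaddress.ip_address(node)
        kind = "IP address (unresolved)"
    except ValueError:
        kind = "pod name"
    print(f"{node} ({info['rack']}): {kind}")
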
grassiale commented 2 years ago

tokenmap_dc-backup-test-dc-backup-test-rack-a-sts-0.json:

{"dc-backup-test-dc-backup-test-rack-a-sts-0": {"tokens": [-2120304533898230137, -3997368863411216165, -5817373164391162564, -6881714154164225960, -8140134745307878994, -951349610719215758, 1033687362401490749, 1449325021973130629, 2661005368645111774, 3058328220993824565, 4316748936939267653, 5351049583614672020, 5777336084895801410, 7127537911470479293, 8184914428399437510, 97774189609975329], "is_up": true, "rack": "rack-a", "dc": "dc-backup-test"}, "10.71.57.183": {"tokens": [-1701626850987540579, -2780999658435142236, -3516915351146035876, -458106845306948606, -4592671232371615416, -5410343164983966210, -6428194569726393226, -7499695186066426038, -8939963415969009278, 2099261950285485530, 3718285245982528827, 4696430698498642903, 509257397771586892, 6475064999322230031, 7571106899557168320, 8662404010249380579], "is_up": true, "rack": "rack-b", "dc": "dc-backup-test"}, "10.71.71.44": {"tokens": [-1254055511089660192, -2371293961961357785, -3085589426359018778, -4917505155314101461, -6055631694703996453, -7117984094374347992, -7780115600676589228, -8444624302853847303, 1817668213599905947, 3450149177604169998, 4083133001077304121, 5061023857470792785, 6208347565826469493, 6875278314747410329, 7932834999437458887, 9175273314117591040], "is_up": true, "rack": "rack-c", "dc": "dc-backup-test"}}

tokenmap_dc-backup-test-dc-backup-test-rack-b-sts-0.json:

{"10.71.28.95": {"tokens": [-2120304533898230137, -3997368863411216165, -5817373164391162564, -6881714154164225960, -8140134745307878994, -951349610719215758, 1033687362401490749, 1449325021973130629, 2661005368645111774, 3058328220993824565, 4316748936939267653, 5351049583614672020, 5777336084895801410, 7127537911470479293, 8184914428399437510, 97774189609975329], "is_up": true, "rack": "rack-a", "dc": "dc-backup-test"}, "dc-backup-test-dc-backup-test-rack-b-sts-0": {"tokens": [-1701626850987540579, -2780999658435142236, -3516915351146035876, -458106845306948606, -4592671232371615416, -5410343164983966210, -6428194569726393226, -7499695186066426038, -8939963415969009278, 2099261950285485530, 3718285245982528827, 4696430698498642903, 509257397771586892, 6475064999322230031, 7571106899557168320, 8662404010249380579], "is_up": true, "rack": "rack-b", "dc": "dc-backup-test"}, "10.71.71.44": {"tokens": [-1254055511089660192, -2371293961961357785, -3085589426359018778, -4917505155314101461, -6055631694703996453, -7117984094374347992, -7780115600676589228, -8444624302853847303, 1817668213599905947, 3450149177604169998, 4083133001077304121, 5061023857470792785, 6208347565826469493, 6875278314747410329, 7932834999437458887, 9175273314117591040], "is_up": true, "rack": "rack-c", "dc": "dc-backup-test"}}

tokenmap_dc-backup-test-dc-backup-test-rack-c-sts-0.json:

{"10.71.28.95": {"tokens": [-2120304533898230137, -3997368863411216165, -5817373164391162564, -6881714154164225960, -8140134745307878994, -951349610719215758, 1033687362401490749, 1449325021973130629, 2661005368645111774, 3058328220993824565, 4316748936939267653, 5351049583614672020, 5777336084895801410, 7127537911470479293, 8184914428399437510, 97774189609975329], "is_up": true, "rack": "rack-a", "dc": "dc-backup-test"}, "10.71.57.183": {"tokens": [-1701626850987540579, -2780999658435142236, -3516915351146035876, -458106845306948606, -4592671232371615416, -5410343164983966210, -6428194569726393226, -7499695186066426038, -8939963415969009278, 2099261950285485530, 3718285245982528827, 4696430698498642903, 509257397771586892, 6475064999322230031, 7571106899557168320, 8662404010249380579], "is_up": true, "rack": "rack-b", "dc": "dc-backup-test"}, "dc-backup-test-dc-backup-test-rack-c-sts-0": {"tokens": [-1254055511089660192, -2371293961961357785, -3085589426359018778, -4917505155314101461, -6055631694703996453, -7117984094374347992, -7780115600676589228, -8444624302853847303, 1817668213599905947, 3450149177604169998, 4083133001077304121, 5061023857470792785, 6208347565826469493, 6875278314747410329, 7932834999437458887, 9175273314117591040], "is_up": true, "rack": "rack-c", "dc": "dc-backup-test"}}%
adejanovski commented 2 years ago

Yep, that's what I thought. In each of them, you have one hostname and two IP addresses. This means the pods manage to resolve their own IP to a hostname but can't resolve the other pods' IPs to their respective hostnames 🤔

I've checked the statefulset definitions and they have the expected serviceName, so I'm a little puzzled... DNS issues like this aren't easy to debug. Could you check one of the medusa container logs for lines like this? [2022-10-07 01:30:15,938] DEBUG: Resolved 10.64.1.2 to dogfood-dc2-default-sts-2

To do some debugging, you'd need to ssh into one of the medusa containers, run python3 and then try to resolve the IP of another Cassandra pod:

import dns.resolver
import dns.reversename

ip_address = "10.71.93.1"  # placeholder: the IP of another Cassandra pod
reverse_name = dns.reversename.from_address(ip_address).to_text()
fqdns = dns.resolver.resolve(reverse_name, 'PTR')
for fqdn in fqdns:
  print(fqdn.to_text())

That should show us which hostnames correspond to the pods' IP addresses.

FTR, the prepare restore error in the logs is a red herring, don't worry about it.

grassiale commented 2 years ago

Here are the lines when starting a backup from the pod in rack a:

[2022-10-07 13:55:31,152] DEBUG: Checking placement using dc and rack...
INFO:root:Resolving ip address 10.71.12.222
[2022-10-07 13:55:31,152] INFO: Resolving ip address 10.71.12.222
INFO:root:ip address to resolve 10.71.12.222
[2022-10-07 13:55:31,152] INFO: ip address to resolve 10.71.12.222
DEBUG:cassandra.connection:Sending initial options message for new connection (140436372562552) to 10.71.57.183:30006
[2022-10-07 13:55:31,154] DEBUG: Sending initial options message for new connection (140436372562552) to 10.71.57.183:30006
DEBUG:cassandra.connection:Received options response on new connection (140436372562552) from 10.71.57.183:30006
[2022-10-07 13:55:31,156] DEBUG: Received options response on new connection (140436372562552) from 10.71.57.183:30006
DEBUG:cassandra.connection:No available compression types supported on both ends. locally supported: odict_keys([]). remotely supported: ['snappy', 'lz4']
[2022-10-07 13:55:31,157] DEBUG: No available compression types supported on both ends. locally supported: odict_keys([]). remotely supported: ['snappy', 'lz4']
DEBUG:cassandra.connection:Sending StartupMessage on <LibevConnection(140436372562552) 10.71.57.183:30006>
[2022-10-07 13:55:31,157] DEBUG: Sending StartupMessage on <LibevConnection(140436372562552) 10.71.57.183:30006>
DEBUG:cassandra.connection:Sent StartupMessage on <LibevConnection(140436372562552) 10.71.57.183:30006>
[2022-10-07 13:55:31,157] DEBUG: Sent StartupMessage on <LibevConnection(140436372562552) 10.71.57.183:30006>
DEBUG:root:Resolved 10.71.12.222 to dc-backup-test-dc-backup-test-rack-a-sts-0
[2022-10-07 13:55:31,158] DEBUG: Resolved 10.71.12.222 to dc-backup-test-dc-backup-test-rack-a-sts-0
WARNING:cassandra.connection:An authentication challenge was not sent, this is suspicious because the driver expects authentication (configured authenticator = PlainTextAuthenticator)
[2022-10-07 13:55:31,159] WARNING: An authentication challenge was not sent, this is suspicious because the driver expects authentication (configured authenticator = PlainTextAuthenticator)
DEBUG:cassandra.connection:Got ReadyMessage on new connection (140436372562552) from 10.71.57.183:30006
[2022-10-07 13:55:31,159] DEBUG: Got ReadyMessage on new connection (140436372562552) from 10.71.57.183:30006
DEBUG:cassandra.connection:Enabling protocol checksumming on connection (140436372562552).
[2022-10-07 13:55:31,159] DEBUG: Enabling protocol checksumming on connection (140436372562552).
DEBUG:cassandra.pool:Finished initializing connection for host 10.71.57.183:30006
[2022-10-07 13:55:31,159] DEBUG: Finished initializing connection for host 10.71.57.183:30006
DEBUG:cassandra.cluster:Added pool for host 10.71.57.183:30006 to session
[2022-10-07 13:55:31,160] DEBUG: Added pool for host 10.71.57.183:30006 to session
DEBUG:root:Checking host 10.71.12.222 against 10.71.12.222/dc-backup-test-dc-backup-test-rack-a-sts-0
[2022-10-07 13:55:31,159] DEBUG: Checking host 10.71.12.222 against 10.71.12.222/dc-backup-test-dc-backup-test-rack-a-sts-0
INFO:root:Resolving ip address 10.71.12.222
[2022-10-07 13:55:31,160] INFO: Resolving ip address 10.71.12.222
INFO:root:ip address to resolve 10.71.12.222
[2022-10-07 13:55:31,160] INFO: ip address to resolve 10.71.12.222
DEBUG:root:Resolved 10.71.12.222 to dc-backup-test-dc-backup-test-rack-a-sts-0
[2022-10-07 13:55:31,165] DEBUG: Resolved 10.71.12.222 to dc-backup-test-dc-backup-test-rack-a-sts-0
INFO:root:Resolving ip address 10.71.57.183
[2022-10-07 13:55:31,165] INFO: Resolving ip address 10.71.57.183
INFO:root:ip address to resolve 10.71.57.183
[2022-10-07 13:55:31,166] INFO: ip address to resolve 10.71.57.183
DEBUG:root:Resolved 10.71.57.183 to 10.71.57.183
[2022-10-07 13:55:31,169] DEBUG: Resolved 10.71.57.183 to 10.71.57.183
INFO:root:Resolving ip address 10.71.71.44
[2022-10-07 13:55:31,169] INFO: Resolving ip address 10.71.71.44
INFO:root:ip address to resolve 10.71.71.44
[2022-10-07 13:55:31,169] INFO: ip address to resolve 10.71.71.44
DEBUG:root:Resolved 10.71.71.44 to 10.71.71.44
[2022-10-07 13:55:31,172] DEBUG: Resolved 10.71.71.44 to 10.71.71.44

And from the pod in rack c:

INFO:root:Resolving ip address 10.71.93.1
[2022-10-07 13:55:31,052] INFO: Resolving ip address 10.71.93.1
DEBUG:cassandra.connection:No available compression types supported on both ends. locally supported: odict_keys([]). remotely supported: ['snappy', 'lz4']
[2022-10-07 13:55:31,052] DEBUG: No available compression types supported on both ends. locally supported: odict_keys([]). remotely supported: ['snappy', 'lz4']
INFO:root:ip address to resolve 10.71.93.1
[2022-10-07 13:55:31,052] INFO: ip address to resolve 10.71.93.1
DEBUG:cassandra.connection:Sending StartupMessage on <LibevConnection(139904893611144) 10.71.57.183:30006>
[2022-10-07 13:55:31,053] DEBUG: Sending StartupMessage on <LibevConnection(139904893611144) 10.71.57.183:30006>
DEBUG:cassandra.connection:Sent StartupMessage on <LibevConnection(139904893611144) 10.71.57.183:30006>
[2022-10-07 13:55:31,053] DEBUG: Sent StartupMessage on <LibevConnection(139904893611144) 10.71.57.183:30006>
DEBUG:root:Resolved 10.71.93.1 to dc-backup-test-dc-backup-test-rack-c-sts-0
[2022-10-07 13:55:31,057] DEBUG: Resolved 10.71.93.1 to dc-backup-test-dc-backup-test-rack-c-sts-0
DEBUG:root:Checking host 10.71.93.1 against 10.71.93.1/dc-backup-test-dc-backup-test-rack-c-sts-0
[2022-10-07 13:55:31,057] DEBUG: Checking host 10.71.93.1 against 10.71.93.1/dc-backup-test-dc-backup-test-rack-c-sts-0
INFO:root:Resolving ip address 10.71.28.95
[2022-10-07 13:55:31,057] INFO: Resolving ip address 10.71.28.95
INFO:root:ip address to resolve 10.71.28.95
[2022-10-07 13:55:31,057] INFO: ip address to resolve 10.71.28.95
DEBUG:root:Resolved 10.71.28.95 to 10.71.28.95
[2022-10-07 13:55:31,059] DEBUG: Resolved 10.71.28.95 to 10.71.28.95
INFO:root:Resolving ip address 10.71.57.183
[2022-10-07 13:55:31,060] INFO: Resolving ip address 10.71.57.183
INFO:root:ip address to resolve 10.71.57.183
[2022-10-07 13:55:31,060] INFO: ip address to resolve 10.71.57.183
WARNING:cassandra.connection:An authentication challenge was not sent, this is suspicious because the driver expects authentication (configured authenticator = PlainTextAuthenticator)
[2022-10-07 13:55:31,062] WARNING: An authentication challenge was not sent, this is suspicious because the driver expects authentication (configured authenticator = PlainTextAuthenticator)
DEBUG:cassandra.connection:Got ReadyMessage on new connection (139904893611144) from 10.71.57.183:30006
[2022-10-07 13:55:31,062] DEBUG: Got ReadyMessage on new connection (139904893611144) from 10.71.57.183:30006
DEBUG:cassandra.connection:Enabling protocol checksumming on connection (139904893611144).
[2022-10-07 13:55:31,062] DEBUG: Enabling protocol checksumming on connection (139904893611144).
DEBUG:cassandra.pool:Finished initializing connection for host 10.71.57.183:30006
[2022-10-07 13:55:31,062] DEBUG: Finished initializing connection for host 10.71.57.183:30006
DEBUG:cassandra.cluster:Added pool for host 10.71.57.183:30006 to session
[2022-10-07 13:55:31,062] DEBUG: Added pool for host 10.71.57.183:30006 to session
DEBUG:root:Resolved 10.71.57.183 to 10.71.57.183
[2022-10-07 13:55:31,066] DEBUG: Resolved 10.71.57.183 to 10.71.57.183
INFO:root:Resolving ip address 10.71.93.1
[2022-10-07 13:55:31,066] INFO: Resolving ip address 10.71.93.1
INFO:root:ip address to resolve 10.71.93.1
[2022-10-07 13:55:31,066] INFO: ip address to resolve 10.71.93.1
DEBUG:root:Resolved 10.71.93.1 to dc-backup-test-dc-backup-test-rack-c-sts-0
[2022-10-07 13:55:31,072] DEBUG: Resolved 10.71.93.1 to dc-backup-test-dc-backup-test-rack-c-sts-0
grassiale commented 2 years ago

And the resolution test, run from the rack-c medusa container, for the pod in rack a:

Type "help", "copyright", "credits" or "license" for more information.
>>> import dns.resolver
>>> import dns.reversename
>>>
>>> reverse_name = dns.reversename.from_address("10.71.12.222").to_text()
>>> fqdns = dns.resolver.resolve(reverse_name, 'PTR')
>>> for fqdn in fqdns:
...   print(fqdn.to_text())
...
10-71-12-222.dc-backup-test-dc-backup-test-service.k8ssandra.svc.cluster.local.
10-71-12-222.dc-backup-test-seed-service.k8ssandra.svc.cluster.local.
10-71-12-222.dc-backup-test-dc-backup-test-additional-seed-service.k8ssandra.svc.cluster.local.
dc-backup-test-dc-backup-test-rack-a-sts-0.dc-backup-test-dc-backup-test-all-pods-service.k8ssandra.svc.cluster.local.
10-71-12-222.dc-backup-test-dc-backup-test-node-port-service.k8ssandra.svc.cluster.local.

In case you need it, here's also a test from the node in rack a resolving the node in rack c:

10-71-93-1.dc-backup-test-dc-backup-test-service.k8ssandra.svc.cluster.local.
10-71-93-1.dc-backup-test-dc-backup-test-node-port-service.k8ssandra.svc.cluster.local.
dc-backup-test-dc-backup-test-rack-c-sts-0.dc-backup-test-dc-backup-test-all-pods-service.k8ssandra.svc.cluster.local.
10-71-93-1.dc-backup-test-seed-service.k8ssandra.svc.cluster.local.
10-71-93-1.dc-backup-test-dc-backup-test-additional-seed-service.k8ssandra.svc.cluster.local.
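
For what it's worth, the PTR answers above do contain the pod name even when resolving another pod's IP, but only via the all-pods-service record; the other records just echo the dashed IP. Here's a rough illustration of picking that record out, using dnspython as in the snippet above (purely illustrative, not Medusa's actual resolution code):

import dns.resolver
import dns.reversename

def pod_name_from_ip(ip_address):
    # Assumes, as in the output above, that exactly one PTR record starts
    # with the pod name while the others start with the dashed IP (10-71-93-1).
    dashed_ip = ip_address.replace(".", "-")
    reverse_name = dns.reversename.from_address(ip_address).to_text()
    for answer in dns.resolver.resolve(reverse_name, "PTR"):
        hostname = answer.to_text().split(".")[0]
        if hostname != dashed_ip:
            return hostname
    return ip_address  # fall back to the bare IP if nothing else matches

print(pod_name_from_ip("10.71.93.1"))  # expected: dc-backup-test-dc-backup-test-rack-c-sts-0
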
grassiale commented 2 years ago

I'm still available for further tests if needed ;)

dnugmanov commented 2 years ago

Hi @grassiale, which version of CoreDNS is running?

grassiale commented 2 years ago

602401143452.dkr.ecr.eu-west-1.amazonaws.com/eks/coredns:v1.8.4-eksbuild.1