crossplane-contrib / provider-confluent

Crossplane upjet provider for Confluent Cloud: https://registry.terraform.io/providers/confluentinc/confluent/latest/docs
Apache License 2.0

kafkatopics fails to resync after the controller restart #18

Closed drneo-mehdi closed 1 month ago

drneo-mehdi commented 3 months ago

What happened?

provider-confluent fails to resync managed objects after the controller is restarted.

How can we reproduce it?

When you first apply the KafkaTopic object, it works perfectly fine. For instance, I'm using this file:

apiVersion: confluent.crossplane.io/v1alpha1
kind: KafkaTopic
metadata:
  name: mehditest-please-5
spec:
  forProvider:
    credentials:

After applying the above, the object becomes ready and synced. But if you restart the controller with `k rollout restart deployment [provider-confluent-deployment]`, the controller fails to sync the objects. Here is an example of one:

k get kafkatopics.confluent.crossplane.io mehditest-please-3
NAME                 READY   SYNCED   EXTERNAL-NAME               AGE
mehditest-please-3   True    False    lkc-ID/mehditest-please-3   14m

And if you describe it:

Warning  CannotObserveExternalResource  32s (x18 over 12m)  managed/confluent.crossplane.io/v1alpha1, kind=kafkatopic  cannot run refresh: refresh failed: error reading Kafka Topic: one of (provider.kafka_api_key, provider.kafka_api_secret), (KAFKA_API_KEY, KAFKA_API_SECRET environment variables) or (resource.credentials.key, resource.credentials.secret) must be set:

On further investigation inside the pod, I can confirm that the tf file doesn't have the credentials set.
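
The check described above could be automated with something like the following. This is only a minimal sketch: it assumes the workspace file has been loaded as Terraform JSON with the usual `resource` / type / name nesting, and the function name `resources_missing_credentials` and the example data are hypothetical. The `credentials.key` and `credentials.secret` field names are taken from the error message.

```python
from typing import Any


def resources_missing_credentials(tf_config: dict[str, Any]) -> list[str]:
    """Return "type.name" for every resource without a usable credentials block."""
    missing = []
    for res_type, instances in tf_config.get("resource", {}).items():
        for res_name, params in instances.items():
            creds = params.get("credentials") or []
            # Terraform JSON allows a block to be a single object or a list of objects.
            if isinstance(creds, dict):
                creds = [creds]
            # A block counts only if both key and secret are non-empty.
            has_creds = any(c.get("key") and c.get("secret") for c in creds)
            if not has_creds:
                missing.append(f"{res_type}.{res_name}")
    return missing


# Hypothetical workspace content after a restart: the credentials block is absent.
example = {
    "resource": {
        "confluent_kafka_topic": {
            "mehditest-please-3": {"topic_name": "mehditest-please-3"}
        }
    }
}
print(resources_missing_credentials(example))
```

An empty result would mean every resource in the file carries both values; any entry in the list matches the symptom reported here.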

What environment did it happen in?

Crossplane version: crossplane-1.16.0
Crossplane Provider confluent version: 0.5

dmvariphy commented 3 months ago

I'm seeing the same thing, but with KafkaACL resources when running e2e tests with uptest. uptest seems to go through four stages: apply, update, import, delete. During the import stage, the Crossplane deployments are restarted (scaled down and up again). Once uptest moves on to the delete stage, I see the `cannot run refresh: refresh failed` error:

cannot run refresh: refresh failed: error reading Kafka ACLs: one of (provider.kafka_api_key, provider.kafka_api_secret), (KAFKA_API_KEY, KAFKA_API_SECRET environment variables) or (resource.credentials.key, resource.credentials.secret) must be set:
apiVersion: confluent.crossplane.io/v1alpha1
kind: KafkaACL
# ...
spec:
  deletionPolicy: Delete
  forProvider:
    credentials:
      - keySecretRef:
          key: api_key_id
          name: kafka-basic-dev-creds
          namespace: crossplane-system
        secretSecretRef:
          key: api_key_secret
          name: kafka-basic-dev-creds
          namespace: crossplane-system
    host: '*'
    kafkaCluster:
      - id: ID-REDACTED
    operation: READ
    patternType: LITERAL
    permission: ALLOW
    principal: User:SA-REDACTED
    resourceName: RESOURCE-NAME-REDACTED
    resourceType: GROUP
    restEndpoint: https://CLUSTER-REDACTED.confluent.cloud:443
  initProvider: {}
  managementPolicies:
    - '*'
  providerConfigRef:
    name: default
status:
  atProvider:
    credentials:
      - keySecretRef:
          key: ""
          name: ""
          namespace: ""
        secretSecretRef:
          key: ""
          name: ""
          namespace: ""
    host: '*'
    id: REDACTED
    kafkaCluster:
      - id: CLUSTER-REDACTED
    operation: READ
    patternType: LITERAL
    permission: ALLOW
    principal: User:SA-REDACTED
    resourceName: RESOURCE-NAME-REDACTED
    resourceType: GROUP
    restEndpoint: https://CLUSTER-REDACTED.confluent.cloud:443
  conditions:
    - lastTransitionTime: "2024-07-01T15:40:01Z"
      reason: Available
      status: "True"
      type: Ready
    - lastTransitionTime: "2024-07-01T15:43:10Z"
      message: 'observe failed: cannot run refresh: refresh failed: error reading Kafka ACLs: one of (provider.kafka_api_key, provider.kafka_api_secret), (KAFKA_API_KEY, KAFKA_API_SECRET environment variables) or (resource.credentials.key, resource.credentials.secret) must be set: '
      reason: ReconcileError
      status: "False"
      type: Synced
    - lastTransitionTime: "2024-07-01T15:39:58Z"
      reason: Success
      status: "True"
      type: LastAsyncOperation
    - lastTransitionTime: "2024-07-01T15:39:58Z"
      reason: Finished
      status: "True"
      type: AsyncOperation
    - lastTransitionTime: "2024-07-01T15:40:02Z"
      reason: UpToDate
      status: "True"
      type: Test
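
The state shown above (Ready is "True" while Synced is "False") can be detected programmatically. The following is a minimal sketch, assuming the managed resource has been loaded into a dict, for example from `kubectl get ... -o json`. The helper name `sync_error` is hypothetical, and the sample data is a trimmed copy of the conditions above.

```python
from typing import Any, Optional


def sync_error(resource: dict[str, Any]) -> Optional[str]:
    """Return the Synced message if the resource is Ready but not Synced, else None."""
    conditions = resource.get("status", {}).get("conditions", [])
    by_type = {c.get("type"): c for c in conditions}
    ready = by_type.get("Ready", {}).get("status") == "True"
    synced = by_type.get("Synced", {})
    if ready and synced.get("status") == "False":
        # Fall back to the reason when the condition carries no message.
        return synced.get("message") or synced.get("reason", "")
    return None


# Trimmed copy of the KafkaACL status from the report.
kafka_acl = {
    "status": {
        "conditions": [
            {"type": "Ready", "status": "True", "reason": "Available"},
            {
                "type": "Synced",
                "status": "False",
                "reason": "ReconcileError",
                "message": "observe failed: cannot run refresh: refresh failed",
            },
        ]
    }
}
print(sync_error(kafka_acl))
```

Running this over every managed resource after a controller restart would list the objects affected by the issue, rather than relying on describing them one by one.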