karmada-io / karmada

Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
https://karmada.io
Apache License 2.0

CRB says to look at status.aggregatedStatus but that does not exist #5241

Closed: grosser closed this issue 1 month ago

grosser commented 1 month ago

What happened:

  conditions:
  - lastTransitionTime: "2024-07-19T05:03:32Z"
    message: Binding has been scheduled successfully.
    reason: Success
    status: "True"
    type: Scheduled
  - lastTransitionTime: "2024-07-19T20:53:45Z"
    message: Failed to apply all works, see status.aggregatedStatus for details
    reason: FullyAppliedFailed
    status: "False"
    type: FullyApplied

but there is no status.aggregatedStatus on either the ClusterResourceBinding (CRB) or the ClusterPropagationPolicy (CPP)

after restarting the controller-manager it looks like this:

Status:
  Conditions:
    Last Transition Time:         2024-07-19T05:03:32Z
    Message:                      Binding has been scheduled successfully.
    Reason:                       Success
    Status:                       True
    Type:                         Scheduled
    Last Transition Time:         2024-07-19T20:53:45Z
    Message:                      Failed to apply all works, see status.aggregatedStatus for details
    Reason:                       FullyAppliedFailed
    Status:                       False
    Type:                         FullyApplied
  Last Scheduled Time:            2024-07-22T04:41:55Z
  Scheduler Observed Generation:  2
Events:
  Type    Reason           Age   From                                 Message
  ----    ------           ----  ----                                 -------
  Normal  SyncWorkSucceed  31s   cluster-resource-binding-controller  Sync work of clusterResourceBinding(workload-migration-integration-yizhang-namespace) successful.

... so is this just an old status that never got updated?

the work also appeared after restarting the controller

Also found a few where this actually worked and looks like this:

  Aggregated Status:
    Applied Message:  Failed to apply all manifests (0/1): the informer of cluster(sandbox-eks) has not been initialized
    Cluster Name:     sandbox-eks
    Health:           Unknown

so whatever cleans up the "Aggregated Status" needs to also clean up the status conditions
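
The mismatch described above (a FullyApplied condition with status "False" that points at status.aggregatedStatus while that field is missing) can be checked mechanically. The following is a minimal sketch, not Karmada code: the function name is hypothetical, and it assumes the binding has already been loaded as a dict, for example from the output of `kubectl get clusterresourcebinding <name> -o json`. The two sample objects are trimmed-down versions of the statuses shown in this report.

```python
from typing import Any


def has_dangling_aggregated_status_hint(binding: dict[str, Any]) -> bool:
    """Return True when FullyApplied is "False" but status.aggregatedStatus is empty.

    Hypothetical helper for illustration; `binding` is the parsed JSON of a
    ResourceBinding or ClusterResourceBinding.
    """
    status = binding.get("status") or {}
    aggregated = status.get("aggregatedStatus") or []
    for condition in status.get("conditions") or []:
        if condition.get("type") == "FullyApplied" and condition.get("status") == "False":
            # The condition message refers to aggregatedStatus, so it should be non-empty.
            return not aggregated
    return False


# Trimmed version of the stale status from this report: no aggregatedStatus at all.
stale = {
    "status": {
        "conditions": [
            {"type": "Scheduled", "status": "True", "reason": "Success"},
            {"type": "FullyApplied", "status": "False", "reason": "FullyAppliedFailed"},
        ]
    }
}

# Trimmed version of the case that worked: the failure details are present.
consistent = {
    "status": {
        "conditions": [
            {"type": "FullyApplied", "status": "False", "reason": "FullyAppliedFailed"},
        ],
        "aggregatedStatus": [
            {"clusterName": "sandbox-eks", "health": "Unknown"},
        ],
    }
}

print(has_dangling_aggregated_status_hint(stale))       # True
print(has_dangling_aggregated_status_hint(consistent))  # False
```

Running a check like this across all bindings would show how many of them carry the stale condition without the details it refers to.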

What you expected to happen:

to see an error message

How to reproduce it (as minimally and precisely as possible):

unclear since I don't know why the work is not getting created

Anything else we need to know?:

Environment:

whitewindmills commented 1 month ago

can you paste karmada-controller-manager logs here?

a7i commented 1 month ago

fixed in #5252

The original issue, "it created a CRB, but no work", turned out to be something else: the Work is created immediately, but is later identified as orphaned and deleted.