Since the separation of Manifest state from Module state, SRE has been alerted on multiple occasions about Manifests stuck in deletion. A Manifest stays in the Deleting state when the related ModuleCR is in the Warning state, which indicates that the end user must perform cleanup before the deletion can proceed.
While we don't want to introduce a dedicated state for this, we want to make this situation filterable in alerting and visible when inspecting the Manifest.
Reasons
Provide an indicator that deletion is blocked by required user interaction so that SRE can filter for it in their alerting.
Acceptance Criteria
[ ] Key assumption confirmed: module teams set the DefaultCR to state "Warning" when deletion is blocked while waiting for user action
[ ] Feature covered by the E2E test
Feature Testing
When ManifestCR.Status.State is Deleting
And ModuleCR.Status.State is Warning
Then ManifestCR.Status.Conditions includes
  - lastTransitionTime: <time>
    message: "Module CR is in Warning state"
    observedGeneration: <gen>
    reason: "Warning"
    status: "True"
    type: "ModuleCRWarning"
Then metric lifecycle_mgr_module_condition{module_name="<>", kyma_name="<>", condition="moduleCRWarning"} is written with value 1
When ManifestCR is deleted
Then metric lifecycle_mgr_module_condition{module_name="<>", kyma_name="<>", condition="moduleCRWarning"} is removed (either set to 0 or deleted completely; check how lifecycle_mgr_module_state is handled and do the same)