Configuration-based installation of OpenShift and Cloud Pak for Data/Integration/Watson AIOps on various private and public cloud infrastructure providers. Deployment attempts to achieve the end-state defined in the configuration. If something fails along the way, you only need to restart the process to continue the deployment.
Problem Description:
When deleting a namespace, the system may hang indefinitely because finalizers are not removed automatically, even when all content has been successfully deleted. This leaves the namespace stuck in a Terminating state. The Kubernetes/OpenShift (oc) API typically reports the following condition, indicating that finalization is pending:
{
  "type": "NamespaceDeletionContentFailure",
  "status": "False",
  "lastTransitionTime": "2024-10-15T09:08:43Z",
  "reason": "ContentDeleted",
  "message": "All content successfully deleted, may be waiting on finalization"
}
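A minimal sketch of how this state could be recognised, assuming the namespace has been fetched as a dictionary (for example from the JSON output of `oc get namespace <name> -o json`) and that the condition above appears under `status.conditions`. The function name `is_waiting_on_finalization` is hypothetical and not part of the deployer.

```python
def is_waiting_on_finalization(namespace: dict) -> bool:
    """Return True if a terminating namespace reports that all content is deleted."""
    status = namespace.get("status", {})
    # Only namespaces that are already being deleted are of interest.
    if status.get("phase") != "Terminating":
        return False
    for cond in status.get("conditions", []):
        if (cond.get("type") == "NamespaceDeletionContentFailure"
                and cond.get("status") == "False"
                and cond.get("reason") == "ContentDeleted"):
            return True
    return False
```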
Proposed Solution:
Implement an automatic deletion of the finalizer after a 5-minute timeout if the system is waiting on finalization after content has been confirmed as deleted.
Desired Outcome:
After a 5-minute timeout, if the deletion process is waiting on finalization and all content has been deleted, the finalizer should be automatically removed to allow the namespace deletion to proceed.
This prevents namespaces from being stuck in a terminating state and enhances system responsiveness.
Key Details for Implementation:
Timeout Logic:
After detecting a namespace condition with "reason": "ContentDeleted", start a 5-minute countdown.
If the finalizer is still present after the 5 minutes, proceed with its deletion.
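The countdown could be sketched as below. This assumes the condition's `lastTransitionTime` is used as the start of the countdown, which is one possible interpretation of the proposal. The names `FINALIZER_TIMEOUT` and `timeout_expired` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

FINALIZER_TIMEOUT = timedelta(minutes=5)  # timeout proposed in this issue


def timeout_expired(condition: dict, now: datetime,
                    timeout: timedelta = FINALIZER_TIMEOUT) -> bool:
    """Return True once the ContentDeleted condition is at least `timeout` old."""
    if condition.get("reason") != "ContentDeleted":
        return False
    # Kubernetes timestamps look like 2024-10-15T09:08:43Z (UTC).
    started = datetime.strptime(
        condition["lastTransitionTime"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return now - started >= timeout
```

A caller would pass `datetime.now(timezone.utc)` as `now` and only proceed with finalizer removal when the function returns True and the finalizer is still present.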
Safety and Error Handling:
Ensure that all necessary resources have been properly deleted before finalizer removal.
Provide logging and auditing for finalizer removal, ensuring it is traceable in case of issues.
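One way to combine the safety check with an audit log is sketched below. It assumes the caller has already counted any resources left in the namespace (`remaining_resources` is a hypothetical input) and that the returned object would then be sent to the namespace's `finalize` subresource (`PUT /api/v1/namespaces/{name}/finalize`). The function and logger names are illustrative only.

```python
import copy
import logging

logger = logging.getLogger("namespace_finalizer")  # hypothetical logger name


def build_finalize_request(namespace: dict, remaining_resources: int) -> dict:
    """Return a copy of the namespace with spec.finalizers cleared.

    Raises ValueError if resources remain, so the finalizer is never
    removed while content still exists.
    """
    name = namespace["metadata"]["name"]
    if remaining_resources > 0:
        raise ValueError(f"{name}: {remaining_resources} resources remain")
    removed = namespace.get("spec", {}).get("finalizers", [])
    updated = copy.deepcopy(namespace)
    updated.setdefault("spec", {})["finalizers"] = []
    # Audit trail: record which finalizers were dropped and from which namespace.
    logger.warning("Removing finalizers %s from namespace %s", removed, name)
    return updated
```

Returning a modified copy rather than mutating the input keeps the original object available for logging or comparison if the removal later needs to be investigated.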
Custom Resource Definition (CRD) Updates:
Modify the current instance handling to include a finalizer removal mechanism after the timeout.
Configuration Options:
Provide an option to disable automatic finalizer removal if stricter control is needed.
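Such an option might be modelled as follows. The key names `auto_remove_finalizer` and `timeout_minutes` are assumptions made for illustration and do not exist in the current configuration. The sketch also assumes the feature is enabled by default, since the request is for an option to disable it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FinalizerCleanupConfig:
    auto_remove_finalizer: bool = True  # hypothetical option name
    timeout_minutes: int = 5            # timeout proposed in this issue


def load_cleanup_config(raw: dict) -> FinalizerCleanupConfig:
    """Build the settings from a parsed configuration mapping."""
    cfg = FinalizerCleanupConfig(
        auto_remove_finalizer=bool(raw.get("auto_remove_finalizer", True)),
        timeout_minutes=int(raw.get("timeout_minutes", 5)),
    )
    if cfg.timeout_minutes <= 0:
        raise ValueError("timeout_minutes must be positive")
    return cfg
```

With this shape, setting `auto_remove_finalizer: false` in the configuration would leave stuck namespaces for manual handling.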
Expected Impact:
Reduced number of namespaces stuck in terminating states.