[Features] kubeblocks cluster allow diagnostic mode for error investigation

nashtsai commented 2 years ago

Is your feature request related to a problem? Please describe. If a cluster's liveness probe fails, the pod is recreated. This may not help if the failover pod has the same problem, so it would be nice to have a diagnostic mode for investigation.

If this is a new feature, please describe the motivation and goals. The motivation is to allow investigation while an error persists but has not yet been identified: stop probe failure events and alerts, and allow diagnostic intervention, e.g., kubectl debug pod ....

In more technical detail, we may have the following scenario:

Filesystem corruption causes the DB server to keep restarting. Here I would want a kind of 'Pause' diagnostic mode, where the pod keeps running without the actual DB server process. This lets me debug into the pod, examine it by hand, and attempt to fix the file corruption, or at least collect all the necessary files (core dumps, configs) for root cause investigation.
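To make the 'Pause' idea concrete, here is a minimal sketch of a container entrypoint decision, assuming a marker file tells the pod to idle instead of starting the DB server. The marker path `/data/.diagnostic-mode` and the function `choose_command` are hypothetical and are not part of KubeBlocks; liveness and readiness probes would still need to be relaxed separately for the pod to stay up.

```python
import os

# Hypothetical marker file; not an actual KubeBlocks convention.
DIAGNOSTIC_FLAG = "/data/.diagnostic-mode"


def choose_command(db_command, flag_path=DIAGNOSTIC_FLAG):
    """Pick the command the container entrypoint should run.

    If the marker file exists, return an idle command so the pod stays
    up without the DB server process. Otherwise return the normal DB
    server command unchanged.
    """
    if os.path.exists(flag_path):
        # Keep the container alive so someone can debug into it.
        return ["sleep", "infinity"]
    return list(db_command)
```

An entrypoint wrapper could pass the result to `os.execvp(cmd[0], cmd)`, so creating the marker file and restarting the container would bring the pod up in a paused state with its volumes still mounted.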

Describe the solution you'd like The Cluster API should allow a diagnostic mode at the component-pod level.

Describe alternatives you've considered Integrating troubleshooting tools with ISV-provided knowledge.

github-actions[bot] commented 1 year ago

This issue has been marked as stale because it has been open for 30 days with no activity

nayutah commented 5 months ago

Completed in 0.9.0 https://github.com/apecloud/kubeblocks/pull/7435

apecloud / kubeblocks, issue #450