apecloud / kubeblocks

KubeBlocks is open-source control plane software that runs and manages databases, message queues, and other stateful applications on K8s.
https://kubeblocks.io
GNU Affero General Public License v3.0
2.08k stars 169 forks

[Features] kubeblocks cluster allow diagnostic mode for error investigation #450

Closed: nashtsai closed this issue 3 months ago

nashtsai commented 1 year ago

Is your feature request related to a problem? Please describe. If a cluster's liveness probe fails, the pod is recreated. That may not help if the failover pod has the same problem, so it would be nice to have a diagnostic mode for investigation.

If this is a new feature, please describe the motivation and goals. The motivation is to allow investigation while an error persists but has not yet been identified: stop the probe failure events and alerts, and allow diagnostic intervention to happen, e.g., kubectl debug pod ....

To get into the technical details, we may have the following scenarios:

  1. Filesystem corruption causes the DB server to keep restarting. Here I would want a kind of 'Pause' diagnostic mode, in which the pod keeps running without the actual DB server process. This would let me debug into the pod, manually examine and attempt to fix the file corruption, or just collect all the necessary files (core dumps, configs) for root cause investigation.
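
To make the 'Pause' scenario concrete, here is a minimal sketch of one way to get that effect today. It edits a pod template held as a plain dict, replacing the DB container's entrypoint with an idle command and dropping its probes. This is an illustration only, not the KubeBlocks API and not necessarily what any release implements. The function name `to_diagnostic_template` is hypothetical, and the sketch assumes the container image ships a `sleep` that accepts `infinity`.

```python
import copy

# Probe fields on a Kubernetes container spec that can trigger restarts
# or remove the pod from service endpoints.
PROBE_KEYS = ("livenessProbe", "readinessProbe", "startupProbe")


def to_diagnostic_template(pod_template, container_name):
    """Return a copy of pod_template with one container 'paused' for debugging.

    Hypothetical helper: pod_template is a dict shaped like a Kubernetes
    pod template ({"spec": {"containers": [...]}}). The input is not mutated.
    """
    patched = copy.deepcopy(pod_template)
    for container in patched["spec"]["containers"]:
        if container["name"] != container_name:
            continue
        # Keep the container alive without starting the DB server process.
        # Assumption: the image provides `sleep` with `infinity` support.
        container["command"] = ["sleep", "infinity"]
        container.pop("args", None)
        # Drop probes so the idle container is not restarted by the kubelet.
        for key in PROBE_KEYS:
            container.pop(key, None)
        return patched
    raise ValueError(f"container {container_name!r} not found in template")
```

The patched template would be applied to the workload that owns the pod, since a running pod's command cannot be changed in place. The data volume stays mounted, so the corrupted files and core dumps remain reachable from inside the idle container.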

Describe the solution you'd like The Cluster API should allow a diagnostic mode at the component-pod level.

Describe alternatives you've considered Integration of troubleshooting tools with ISV-provided knowledge.

github-actions[bot] commented 1 year ago

This issue has been marked as stale because it has been open for 30 days with no activity

nayutah commented 3 months ago

Completed in 0.9.0 https://github.com/apecloud/kubeblocks/pull/7435