pingcap / docs-tidb-operator

Documentation for TiDB on Kubernetes in both English and Chinese.
https://docs.pingcap.com/tidb-in-kubernetes

Disaster recovery documentation for k8s deployment #246

Closed by tirsen 4 years ago

tirsen commented 4 years ago

The current disaster recovery documentation is centered on an Ansible deployment. We need better docs for disaster recovery on k8s.

DanielZhangQD commented 4 years ago

@tirsen Please check if the doc here is what you're looking for. BTW, could you please update the doc link for the disaster recovery documentation centered around the Ansible deployment?

tirsen commented 4 years ago

Essentially, I would need "ports" of the docs below that explain how to run these tools on Kubernetes. It's not very easy! I think you might need to include pd-recover and tikv-ctl in the default images, as well as work out methods for running them well on a Kubernetes deployment. @tennix and I spent a lot of time trying to get it to work, and it was not a very smooth experience.

https://pingcap.com/docs/stable/reference/tools/pd-recover/
https://tikv.org/docs/3.0/reference/tools/tikv-ctl/#force-region-to-recover-the-service-from-failure-of-multiple-replicas

The doc you linked is good, but it doesn't cover all the cases, especially not the one where you've lost disks.

DanielZhangQD commented 4 years ago

The doc for pd-recover is in review in PR https://github.com/pingcap/docs-tidb-operator/pull/245. The English version will be submitted once the Chinese version is merged.

The doc for tikv-ctl usage on Kubernetes is here.

I have created issues to include the tool binaries in the images:
pd-recover: https://github.com/pingcap/pd/issues/2402
tikv-ctl: https://github.com/tikv/tikv/issues/7756

For now, I have added a procedure for downloading the binaries to the pd-recover doc. @tirsen

DanielZhangQD commented 4 years ago

Chinese doc: https://github.com/pingcap/docs-tidb-operator/pull/245
English doc: https://github.com/pingcap/docs-tidb-operator/pull/250

tirsen commented 4 years ago

Wow, this is really useful documentation, and so quickly delivered. Thanks!