deadjoker closed this issue 3 years ago
@deadjoker It may be caused by broken data. You can try deleting the tc and reinstalling it; please make sure that the PVs do not contain stale data.
@DanielZhangQD The PVs were created when the pod started, so they were clean directories. After the pod failed, I checked the dirs on ceph and found that 'pd/member/snap/db' had been created, but the log said the file was broken because of the error "value too large for defined data type".
Besides, I tried reinstalling it, but had no luck either.
@DanielZhangQD What disappointed me is that I set the replicas to 3 and 2 of the pods are running well. Only the 3rd pod cannot start.
@deadjoker OK, understood. cc @dragonly Please help follow up on this issue with the PD team, thanks!
@DanielZhangQD @dragonly Our test environment is tidb-operator 1.1.7, k8s 1.18.9, docker 19.03.5, ceph 15.2.5, ceph-csi 3.1.0.
@deadjoker hi, could you please do the following to make sure that the issue is reproducible:
1. Delete the TidbCluster and all related PVCs, for example: kubectl delete tc --all -n ${tidb-cluster-ns} && kubectl delete pvc --all -n ${tidb-cluster-ns}
2. Create the TidbCluster using the original yaml
@dragonly I tried creating a new cluster using the original yaml, and it succeeded. I then created the cluster with the localstorage storageclass successfully as well.
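For anyone repeating the clean-up and re-create steps suggested above, they can be scripted roughly as follows. This is only a sketch: the two delete commands are the ones quoted in the thread, while the `kubectl apply -f` step, the namespace `tidb-cluster`, and the manifest name `tidb-cluster.yaml` are assumptions to be replaced with your own values. By default the script only prints the commands.

```python
import shlex
import subprocess


def repro_commands(namespace, manifest):
    """Build the kubectl commands for a clean re-create of the cluster."""
    return [
        # Delete the TidbCluster objects and their PVCs (commands from the thread).
        ["kubectl", "delete", "tc", "--all", "-n", namespace],
        ["kubectl", "delete", "pvc", "--all", "-n", namespace],
        # Re-create the cluster from the original manifest (assumed to be applied with `kubectl apply`).
        ["kubectl", "apply", "-f", manifest, "-n", namespace],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(shlex.join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical namespace and manifest path; dry run only.
    run(repro_commands("tidb-cluster", "tidb-cluster.yaml"))
```

Pass `dry_run=False` to `run` only after checking that the printed commands target the intended namespace, since the delete steps remove all PVCs in it.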
@deadjoker :+1: Feel free to report here if anything goes wrong again.
PTAL at the potential PD issue here, @Yisaer, as reported in the log:
[2020/11/24 06:58:36.918 +00:00] [PANIC] [backend.go:157] ["failed to open database"] [path=/var/lib/pd/member/snap/db] [error="value too large for defined data type"]
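To make the log line above easier to reason about, here is a small sketch that splits it into its bracketed sections and then asks the local C library which errno names produce the same message text. The regular expression is an assumption based on this one line, not a full parser for PD's log format, and the errno lookup is platform dependent. On a glibc-based Linux system the message is expected to correspond to `EOVERFLOW`, which would suggest the failure comes from a system call on the file rather than from corrupted file contents.

```python
import errno
import os
import re

LOG_LINE = (
    '[2020/11/24 06:58:36.918 +00:00] [PANIC] [backend.go:157] '
    '["failed to open database"] [path=/var/lib/pd/member/snap/db] '
    '[error="value too large for defined data type"]'
)

# A section is "[...]" whose body mixes quoted strings and other non-"]" characters.
SECTION = re.compile(r'\[((?:"(?:[^"\\]|\\.)*"|[^\]"])*)\]')


def parse_log_line(line):
    """Split one log line into time, level, source, message and key=value fields."""
    sections = SECTION.findall(line)
    timestamp, level, source, message = sections[:4]
    fields = {}
    for section in sections[4:]:
        key, _, value = section.partition("=")
        fields[key] = value.strip('"')
    return {
        "time": timestamp,
        "level": level,
        "source": source,
        "message": message.strip('"'),
        "fields": fields,
    }


def matching_errnos(text):
    """Return errno names whose local strerror() text equals `text` (case-insensitive)."""
    wanted = text.lower()
    return sorted(
        name for code, name in errno.errorcode.items()
        if os.strerror(code).lower() == wanted
    )


if __name__ == "__main__":
    record = parse_log_line(LOG_LINE)
    print(record["level"], record["source"], record["fields"]["path"])
    # The result depends on the platform's C library; it may be empty.
    print(matching_errnos(record["fields"]["error"]))
```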
Question
I'm using a cephfs storageclass for the PVC in k8s. The PD server failed to start when I attached a cephfs PVC to the pod. Here is the log:
The storageclass works well for other service pods. What's the issue?
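One way to narrow down a question like this is to run a few basic file operations against the mounted volume from inside a pod and see which one fails. The sketch below is an assumption-laden probe, not PD's actual start-up code: it assumes the embedded database opens the file read-write, locks it, stats it, and memory-maps it, and the file name `probe.db` is made up. Point `probe_directory` at a directory on the cephfs mount; any step that does not report `ok` shows the errno name instead.

```python
import errno
import fcntl
import mmap
import os
import tempfile


def probe_directory(directory, size=4096 * 4):
    """Try open/ftruncate/flock/fstat/mmap on a scratch file and report each result."""
    path = os.path.join(directory, "probe.db")  # hypothetical scratch file name
    results = {}
    try:
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    except OSError as exc:
        return {"open": errno.errorcode.get(exc.errno, str(exc.errno))}
    results["open"] = "ok"
    try:
        steps = {
            "ftruncate": lambda: os.ftruncate(fd, size),
            "flock": lambda: fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB),
            "fstat": lambda: os.fstat(fd),
            "mmap": lambda: mmap.mmap(fd, size, prot=mmap.PROT_READ).close(),
        }
        for name, step in steps.items():
            try:
                step()
                results[name] = "ok"
            except OSError as exc:
                # Record the symbolic errno name, e.g. EOVERFLOW.
                results[name] = errno.errorcode.get(exc.errno, str(exc.errno))
            except ValueError as exc:
                # mmap raises ValueError when the file is shorter than the mapping.
                results[name] = "error: " + str(exc)
    finally:
        os.close(fd)
        os.remove(path)
    return results


if __name__ == "__main__":
    # Replace the temporary directory with a path on the volume under test.
    with tempfile.TemporaryDirectory() as scratch:
        print(probe_directory(scratch))
```

Running the same probe on a volume from a storageclass that works, such as local storage, gives a baseline to compare against.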