vladbrk opened this issue 6 years ago
I had the same issue with OSD nodes. After deleting all OSD pods (kubectl delete pod -n ceph ceph-osd-dev-...) the cluster came up nicely.
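A minimal sketch of doing that in one go (assuming the OSD pods all match the ceph-osd-dev name prefix; adjust the grep pattern if yours differ):
kubectl -n ceph get pods -o name | grep ceph-osd-dev | xargs kubectl -n ceph delete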
For GKE, you should SSH into the OSD VM instances and unmount the local SSD disks. Make sure you're using the Ubuntu image type, and turn off the liveness/readiness probes.
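Roughly what that looks like on GKE, assuming the default /mnt/disks/ssd0 and /mnt/disks/ssd1 mount points for local SSDs (check df -h on the node first; the node name is taken from the events below):
gcloud compute ssh gke-standard-cluster-1-default-pool-732c77d1-889q
sudo umount /mnt/disks/ssd0
sudo umount /mnt/disks/ssd1
exit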
@neuhalje After deleting ceph-osd-dev-... I got rid of the "FailedMount ... MountVolume.SetUp failed for volume ..." error, but the ceph-osd-dev pods are still stuck in Init:CrashLoopBackOff.
kubectl describe -n ceph pod ceph-osd-dev-sdb-mdplb
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Created 1m kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q Created container
Normal SuccessfulMountVolume 1m kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q MountVolume.SetUp succeeded for volume "run-udev"
Normal SuccessfulMountVolume 1m kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q MountVolume.SetUp succeeded for volume "pod-var-lib-ceph"
Normal SuccessfulMountVolume 1m kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q MountVolume.SetUp succeeded for volume "ceph-etc"
Normal SuccessfulMountVolume 1m kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q MountVolume.SetUp succeeded for volume "ceph-bootstrap-osd-keyring"
Normal SuccessfulMountVolume 1m kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q MountVolume.SetUp succeeded for volume "ceph-bootstrap-rgw-keyring"
Normal SuccessfulMountVolume 1m kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q MountVolume.SetUp succeeded for volume "default-token-9mmv7"
Normal SuccessfulMountVolume 1m kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q MountVolume.SetUp succeeded for volume "pod-run"
Normal SuccessfulMountVolume 1m kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q MountVolume.SetUp succeeded for volume "ceph-mon-keyring"
Normal SuccessfulMountVolume 1m (x3 over 1m) kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q (combined from similar events): MountVolume.SetUp succeeded for volume "ceph-bin"
Normal Pulled 1m kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q Container image "docker.io/kolla/ubuntu-source-kubernetes-entrypoint:4.0.0" already present on machine
Normal SuccessfulMountVolume 1m kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q MountVolume.SetUp succeeded for volume "devices"
Normal Started 1m kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q Started container
Normal Pulled 1m kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q Container image "docker.io/ceph/daemon:tag-build-master-luminous-ubuntu-16.04" already present on machine
Normal Created 1m kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q Created container
Normal Started 1m kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q Started container
Normal Pulled 1m (x2 over 1m) kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q Container image "docker.io/ceph/daemon:tag-build-master-luminous-ubuntu-16.04" already present on machine
Normal Created 1m (x2 over 1m) kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q Created container
Normal Started 1m (x2 over 1m) kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q Started container
Warning BackOff 1m kubelet, gke-standard-cluster-1-default-pool-732c77d1-889q Back-off restarting failed container
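With the pod stuck in Init:CrashLoopBackOff, the interesting log is the failing init container's rather than the main container's. A diagnostic sketch (the container names are whatever the first command prints; osd-prepare-pod below is only an assumption):
kubectl -n ceph get pod ceph-osd-dev-sdb-mdplb -o jsonpath='{.spec.initContainers[*].name}'
kubectl -n ceph logs ceph-osd-dev-sdb-mdplb -c osd-prepare-pod --previous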
Still, I decided to go further and continued building the cluster using the script below:
kubectl -n ceph exec -ti ceph-mon-9nz8x -c ceph-mon -- bash
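# (assumed step, not shown in the original) the base64 key on the next line was
# presumably obtained inside the mon pod with: ceph auth get-key client.admin | base64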
# QVFDbkR6NWN0MjkyTlJBQXgyWmlNRGF6SzF5OW9idVhNRXlNNFE9PQo=
exit
kubectl -n ceph edit secrets/pvc-ceph-client-key
# manually add key
#apiVersion: v1
#data:
# key: QVFDbkR6NWN0MjkyTlJBQXgyWmlNRGF6SzF5OW9idVhNRXlNNFE9PQo=
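# (alternative sketch, assumed) the same change can be applied non-interactively:
# kubectl -n ceph patch secret pvc-ceph-client-key -p '{"data":{"key":"QVFDbkR6NWN0MjkyTlJBQXgyWmlNRGF6SzF5OW9idVhNRXlNNFE9PQo="}}'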
kubectl -n ceph get secrets/pvc-ceph-client-key -o json | jq '.metadata.namespace = "default"' | kubectl create -f -
kubectl get secrets
kubectl -n ceph exec -ti ceph-mon-9nz8x -c ceph-mon -- ceph osd pool create rbd 128
kubectl -n ceph exec -ti ceph-mon-9nz8x -c ceph-mon -- rbd pool init rbd
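# (optional check, not in the original) confirm the pool was created before claiming from it
kubectl -n ceph exec -ti ceph-mon-9nz8x -c ceph-mon -- ceph osd pool ls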
cat <<EOF > pvc-rbd.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: ceph-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: ceph-rbd
EOF
kubectl create -f pvc-rbd.yaml
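Since the claim references storageClassName: ceph-rbd, it is worth confirming that the StorageClass exists and is wired to the ceph.com/rbd provisioner; a quick check (not part of the original script):
kubectl get storageclass ceph-rbd -o yaml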
And it does not work, because ceph-pvc stays Pending:
kubectl get pvc
> NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
> ceph-pvc Pending ceph-rbd 7d
kubectl describe pvc ceph-pvc
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ExternalProvisioning 4m (x237324 over 6d) persistentvolume-controller waiting for a volume to be created, either by external provisioner "ceph.com/rbd" or manually created by system administrator
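That event means the external provisioner registered as "ceph.com/rbd" never picked up the claim. A hedged way to check whether the rbd-provisioner that ceph-helm deploys is actually running, without assuming its exact pod name:
kubectl -n ceph get pods | grep rbd-provisioner
kubectl -n ceph get pods -o name | grep rbd-provisioner | xargs -I{} kubectl -n ceph logs {}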
How did you add the user and secrets to the Ceph cluster?
I followed the same procedure, but during "make" it shows no secrets found in helm-toolkit. Hence, after the helm install steps, the ceph-mon pod goes into CrashLoopBackOff, and its log states that it could not find the various required keyrings.
@jlhanrey I got the same issue. I have already tried everything, but it is still not working.
I'm trying to deploy Ceph on GKE (k8s) using ceph-helm, but after running "helm install ..." the OSD pods can't be created due to the error "MountVolume.SetUp failed for volume ...". The full error is below.
GKE env: 2 nodes, n1-standard-1 (1 vCPU, 3.75 GB RAM, 100 GB HDD, 2 mounted 375 GB local SSDs); Kubernetes 1.9.7-gke.6 or 1.10.7-gke.6.
I followed this instruction (my script is below): http://docs.ceph.com/docs/mimic/start/kube-helm/ but it fails on the "helm install ..." command.
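For reference, the install command from that guide is roughly the following (the overrides file name is whatever was prepared in the earlier steps):
helm install --name=ceph local/ceph --namespace=ceph -f ~/ceph-overrides.yaml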
kubectl get pods -n ceph
Describing the failed OSD pod shows:
The command "lsblk -f" on both nodes shows:
The command "gdisk -l /dev/sdb" (and sdc) on the OSD nodes shows:
The commands below show no errors:
kubectl logs -n ceph pod/ceph-mon-xjjs4 -c ceph-mon | grep error
kubectl logs -n ceph pod/ceph-mon-xjjs4 -c cluster-log-tailer | grep error
kubectl logs -n ceph pod/ceph-mon-xjjs4 -c cluster-audit-log-tailer | grep error
Output of "kubectl logs -n ceph pod/ceph-mon-xjjs4 -c ceph-mon":
My script for the Ceph installation:
P.S. I had read these articles, but it looks like they aren't my case: https://github.com/ceph/ceph-helm/issues/55 https://github.com/ceph/ceph-helm/issues/51 https://github.com/ceph/ceph-helm/issues/48 https://github.com/ceph/ceph-helm/issues/45
And if somebody knows another (tested and working) way to deploy Ceph on K8s, please tell me.