gluster / gluster-kubernetes

GlusterFS Native Storage Service for Kubernetes
Apache License 2.0

GlusterFS pods start but gk-deploy still timed out. #533

Open DineshC001 opened 5 years ago

DineshC001 commented 5 years ago

I ran the script with the following command: `./gk-deploy -gvy -n glusterfs`. Below is the output:

```
Using Kubernetes CLI.

Checking status of namespace matching 'glusterfs':
Flag --show-all has been deprecated, will be removed in an upcoming release
glusterfs   Active   18m
Using namespace "glusterfs".
Checking for pre-existing resources...
  GlusterFS pods ...
Checking status of pods matching '--selector=glusterfs=pod':
Flag --show-all has been deprecated, will be removed in an upcoming release
No resources found.
Timed out waiting for pods matching '--selector=glusterfs=pod'.
not found.
  deploy-heketi pod ...
Checking status of pods matching '--selector=deploy-heketi=pod':
Flag --show-all has been deprecated, will be removed in an upcoming release
No resources found.
Timed out waiting for pods matching '--selector=deploy-heketi=pod'.
not found.
  heketi pod ...
Checking status of pods matching '--selector=heketi=pod':
Flag --show-all has been deprecated, will be removed in an upcoming release
No resources found.
Timed out waiting for pods matching '--selector=heketi=pod'.
not found.
Creating initial resources ... /usr/local/bin/kubectl -n glusterfs create -f /Users/a9405976/Documents/git/glusterfs/gluster-latest/gluster-kubernetes-1.2.0/deploy/kube-templates/heketi-service-account.yaml 2>&1
serviceaccount/heketi-service-account created
/usr/local/bin/kubectl -n glusterfs create clusterrolebinding heketi-sa-view --clusterrole=edit --serviceaccount=glusterfs:heketi-service-account 2>&1
clusterrolebinding.rbac.authorization.k8s.io/heketi-sa-view created
/usr/local/bin/kubectl -n glusterfs label --overwrite clusterrolebinding heketi-sa-view glusterfs=heketi-sa-view heketi=sa-view
clusterrolebinding.rbac.authorization.k8s.io/heketi-sa-view labeled
OK
Marking 'dev-p04-app-01' as a GlusterFS node.
/usr/local/bin/kubectl -n glusterfs label nodes dev-p54-app-01 storagenode=glusterfs 2>&1
node/dev-p04-app-01 labeled
Marking 'dev-p04-app-02' as a GlusterFS node.
/usr/local/bin/kubectl -n glusterfs label nodes dev-p54-app-02 storagenode=glusterfs 2>&1
node/dev-p04-app-02 labeled
Deploying GlusterFS pods.
sed -e 's/storagenode\: glusterfs/storagenode\: 'glusterfs'/g' /Users/auser/Documents/git/glusterfs/gluster-latest/gluster-kubernetes-1.2.0/deploy/kube-templates/glusterfs-daemonset.yaml | /usr/local/bin/kubectl -n glusterfs create -f - 2>&1
daemonset.extensions/glusterfs created
Waiting for GlusterFS pods to start ...
Checking status of pods matching '--selector=glusterfs=pod':
Flag --show-all has been deprecated, will be removed in an upcoming release
glusterfs-c52qx   1/1   Running   0   5m
glusterfs-x7g7w   1/1   Running   0   5m
Timed out waiting for pods matching '--selector=glusterfs=pod'.
pods not found.
```
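The pods are clearly Running, so the timeout appears to come from how the script reads the `kubectl get pods` output rather than from GlusterFS itself. One possibility, not verified, is that the "Flag --show-all has been deprecated" warning printed by newer kubectl versions is captured together with the pod list (the script runs kubectl with `2>&1`) and confuses its parsing. A quick manual check of exactly what the script waits for, using the namespace and selector from the run above:

```sh
# List the pods gk-deploy is waiting for, with the same selector it uses
kubectl -n glusterfs get pods --selector=glusterfs=pod -o wide

# Confirm the DaemonSet actually applies that label to its pods
kubectl -n glusterfs get daemonset glusterfs -o jsonpath='{.spec.template.metadata.labels}'
```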

I see the following in glusterd.log:

```
+------------------------------------------------------------------------------+
[2018-11-07 13:07:52.349066] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2018-11-07 13:19:19.271649] I [MSGID: 100030] [glusterfsd.c:2741:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 4.1.5 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO)
[2018-11-07 13:19:19.320023] I [MSGID: 106478] [glusterd.c:1423:init] 0-management: Maximum allowed open file descriptors set to 65536
[2018-11-07 13:19:19.320079] I [MSGID: 106479] [glusterd.c:1481:init] 0-management: Using /var/lib/glusterd as working directory
[2018-11-07 13:19:19.320092] I [MSGID: 106479] [glusterd.c:1486:init] 0-management: Using /var/run/gluster as pid file working directory
[2018-11-07 13:19:19.369486] W [MSGID: 103071] [rdma.c:4629:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
[2018-11-07 13:19:19.369525] W [MSGID: 103055] [rdma.c:4938:init] 0-rdma.management: Failed to initialize IB Device
[2018-11-07 13:19:19.369539] W [rpc-transport.c:351:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
[2018-11-07 13:19:19.369676] W [rpcsvc.c:1781:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
[2018-11-07 13:19:19.369696] E [MSGID: 106244] [glusterd.c:1764:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2018-11-07 13:19:21.565013] E [MSGID: 101032] [store.c:441:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/glusterd.info. [No such file or directory]
[2018-11-07 13:19:21.565076] E [MSGID: 101032] [store.c:441:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/glusterd.info. [No such file or directory]
[2018-11-07 13:19:21.565084] I [MSGID: 106514] [glusterd-store.c:2262:glusterd_restore_op_version] 0-management: Detected new install. Setting op-version to maximum : 40100
[2018-11-07 13:19:21.579070] I [MSGID: 106194] [glusterd-store.c:3849:glusterd_store_retrieve_missed_snaps_list] 0-management: No missed snaps list.
Final graph:
+------------------------------------------------------------------------------+
  1: volume management
  2:     type mgmt/glusterd
  3:     option rpc-auth.auth-glusterfs on
  4:     option rpc-auth.auth-unix on
  5:     option rpc-auth.auth-null on
  6:     option rpc-auth-allow-insecure on
  7:     option transport.listen-backlog 10
  8:     option event-threads 1
  9:     option ping-timeout 0
 10:     option transport.socket.read-fail-log off
 11:     option transport.socket.keepalive-interval 2
 12:     option transport.socket.keepalive-time 10
 13:     option transport-type rdma
 14:     option working-directory /var/lib/glusterd
 15: end-volume
 16:
+------------------------------------------------------------------------------+
[2018-11-07 13:19:21.581518] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
```
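For what it's worth, the RDMA lines (`rdma_cm event channel creation failed`, `Failed to initialize IB Device`, and the resulting `creation of 1 listeners failed`) concern the optional RDMA transport and are normally harmless on hosts without InfiniBand hardware; the `glusterd.info ... No such file or directory` entries are also expected on a fresh install. A quick way to see whether glusterd inside a pod is otherwise healthy (pod name taken from the output above; substitute your own):

```sh
# Ask glusterd inside the pod for its version and peer view
kubectl -n glusterfs exec glusterfs-c52qx -- gluster --version
kubectl -n glusterfs exec glusterfs-c52qx -- gluster peer status

# If the image runs systemd, check the service state as well
kubectl -n glusterfs exec glusterfs-c52qx -- systemctl status glusterd
```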

I have also tried aborting the deployment and deleting the following directories: /etc/glusterfs and /var/lib/glusterd. I tried this with the latest release, gluster-kubernetes-1.2.0.
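For reference, a full reset between attempts has to happen on every storage node, not only through the deploy script. A rough sketch of the cleanup, assuming the default state directories and a hypothetical /dev/sdb as the storage device:

```sh
# Remove the Kubernetes objects gk-deploy created
./gk-deploy --abort -n glusterfs

# On each storage node: clear leftover GlusterFS/heketi state
# (these are the usual default paths; adjust for your layout)
sudo rm -rf /etc/glusterfs /var/lib/glusterd /var/lib/heketi /var/log/glusterfs

# Wipe any filesystem/LVM signatures from the raw device heketi manages
sudo wipefs -a /dev/sdb
```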

DineshC001 commented 5 years ago

I'm trying to deploy GlusterFS in a Kubernetes cluster that was deployed using Rancher. I have two VMs running RHEL 7. I believe I have installed all the prerequisites listed in the setup guide. The only difference is that the VM hosting the pod has glusterfs 3.12.2, while the pod itself is running glusterfs 4.1.5. Could this be the reason I see this error in glusterd.log?

```
[2018-11-07 23:05:15.364626] E [MSGID: 106244] [glusterd.c:1764:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
```
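To rule the version skew in or out, it is easy to compare the two directly; a small sketch, with the pod name being whatever `kubectl get pods` reports in your cluster:

```sh
# Versions of the GlusterFS packages on the RHEL 7 host
glusterfs --version
rpm -qa | grep gluster

# Version running inside the GlusterFS pod
kubectl -n glusterfs get pods -l glusterfs=pod
kubectl -n glusterfs exec <glusterfs-pod-name> -- glusterfs --version
```

That particular "creation of 1 listeners failed" line, though, refers to the RDMA listener mentioned above, so it is unlikely to be caused by the host package version.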

DineshC001 commented 5 years ago

Now I have glusterfs 4.1.5 on the VM, but the error is still there. I am using an LVM logical volume created under /dev/mapper.
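One thing worth double-checking here: the gk-deploy setup guide expects each node to expose at least one bare block device (no partitions, filesystem, or LVM signatures) that heketi can initialize with its own PV/VG, so pointing the topology at a pre-created logical volume under /dev/mapper may itself cause trouble. A quick sanity check on the device listed in topology.json (the /dev/sdb name is only an example):

```sh
# The device heketi manages should report no existing signatures
lsblk /dev/sdb
sudo wipefs /dev/sdb    # prints nothing when the device is clean; add -a to actually wipe it
```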

DineshC001 commented 5 years ago

Today, I finally got it working using the master branch. gluster-kubernetes-1.2.0 is a year old, and a new stable release is urgently required. Even though I continue to get the above error, it deploys GlusterFS. But I see the message below at the end of the gk-deploy run:

```
/usr/local/bin/kubectl -n glusterfs exec -i deploy-heketi-6fbbbc8b59-lkrqf -- heketi-cli -s http://localhost:8080 --user admin --secret '' setup-openshift-heketi-storage --listfile=/tmp/heketi-storage.json 2>&1
Error: Failed to allocate new volume: No space
command terminated with exit code 255
Failed on setup openshift heketi storage
This may indicate that the storage must be wiped and the GlusterFS nodes must be reset.
```
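A hedged note on the "No space" failure: as far as I know, the heketidbstorage volume that setup-openshift-heketi-storage creates defaults to replica 3, so a cluster with only two storage nodes can fail this step even when the devices have plenty of free capacity. Checking what deploy-heketi actually registered makes this easy to confirm (pod name taken from the output above):

```sh
# Show the nodes, devices, and free space deploy-heketi knows about
kubectl -n glusterfs exec -i deploy-heketi-6fbbbc8b59-lkrqf -- \
  heketi-cli -s http://localhost:8080 --user admin --secret '' topology info
```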

Should it not default to only running kube-templates?