ananbas opened 5 years ago
heketi pod log
[heketi] INFO 2019/02/24 06:27:40 Starting Node Health Status refresh
[heketi] INFO 2019/02/24 06:27:40 Cleaned 0 nodes from health cache
[negroni] Started GET /clusters
[negroni] Completed 200 OK in 176.711µs
[negroni] Started POST /clusters
[negroni] Completed 201 Created in 643.227µs
[negroni] Started POST /nodes
[cmdexec] INFO 2019/02/24 06:29:33 Check Glusterd service status in node bdprdsn0002
[heketi] INFO 2019/02/24 06:29:40 Starting Node Health Status refresh
[heketi] INFO 2019/02/24 06:29:40 Cleaned 0 nodes from health cache
[negroni] Completed 400 Bad Request in 30.001699578s
[kubeexec] ERROR 2019/02/24 06:30:03 heketi/pkg/remoteexec/kube/target.go:134:kube.TargetDaemonSet.GetTargetPod: Get https://10.96.0.1:443/api/v1/namespaces/default/pods?labelSelector=glusterfs-node: dial tcp 10.96.0.1:443: i/o timeout
[kubeexec] ERROR 2019/02/24 06:30:03 heketi/pkg/remoteexec/kube/target.go:135:kube.TargetDaemonSet.GetTargetPod: Failed to get list of pods
[cmdexec] ERROR 2019/02/24 06:30:03 heketi/executors/cmdexec/peer.go:81:cmdexec.(*CmdExecutor).GlusterdCheck: Failed to get list of pods
[heketi] ERROR 2019/02/24 06:30:03 heketi/apps/glusterfs/app_node.go:107:glusterfs.(*App).NodeAdd: Failed to get list of pods
[heketi] ERROR 2019/02/24 06:30:03 heketi/apps/glusterfs/app_node.go:108:glusterfs.(*App).NodeAdd: New Node doesn't have glusterd running
[negroni] Started POST /nodes
[cmdexec] INFO 2019/02/24 06:30:03 Check Glusterd service status in node bdprdsn0003
[negroni] Completed 400 Bad Request in 30.001886914s
[kubeexec] ERROR 2019/02/24 06:30:33 heketi/pkg/remoteexec/kube/target.go:134:kube.TargetDaemonSet.GetTargetPod: Get https://10.96.0.1:443/api/v1/namespaces/default/pods?labelSelector=glusterfs-node: dial tcp 10.96.0.1:443: i/o timeout
[kubeexec] ERROR 2019/02/24 06:30:33 heketi/pkg/remoteexec/kube/target.go:135:kube.TargetDaemonSet.GetTargetPod: Failed to get list of pods
[cmdexec] ERROR 2019/02/24 06:30:33 heketi/executors/cmdexec/peer.go:81:cmdexec.(*CmdExecutor).GlusterdCheck: Failed to get list of pods
[heketi] ERROR 2019/02/24 06:30:33 heketi/apps/glusterfs/app_node.go:107:glusterfs.(*App).NodeAdd: Failed to get list of pods
[heketi] ERROR 2019/02/24 06:30:33 heketi/apps/glusterfs/app_node.go:108:glusterfs.(*App).NodeAdd: New Node doesn't have glusterd running
[negroni] Started POST /nodes
[cmdexec] INFO 2019/02/24 06:30:33 Check Glusterd service status in node bdprdsn0004
[kubeexec] ERROR 2019/02/24 06:31:03 heketi/pkg/remoteexec/kube/target.go:134:kube.TargetDaemonSet.GetTargetPod: Get https://10.96.0.1:443/api/v1/namespaces/default/pods?labelSelector=glusterfs-node: dial tcp 10.96.0.1:443: i/o timeout
[kubeexec] ERROR 2019/02/24 06:31:03 heketi/pkg/remoteexec/kube/target.go:135:kube.TargetDaemonSet.GetTargetPod: Failed to get list of pods
[cmdexec] ERROR 2019/02/24 06:31:03 heketi/executors/cmdexec/peer.go:81:cmdexec.(*CmdExecutor).GlusterdCheck: Failed to get list of pods
[heketi] ERROR 2019/02/24 06:31:03 heketi/apps/glusterfs/app_node.go:107:glusterfs.(*App).NodeAdd: Failed to get list of pods
[heketi] ERROR 2019/02/24 06:31:03 heketi/apps/glusterfs/app_node.go:108:glusterfs.(*App).NodeAdd: New Node doesn't have glusterd running
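Note that the `i/o timeout` on `10.96.0.1:443` means the heketi pod never reaches the API server at all; a permission problem would come back as a 4xx response instead. A minimal sketch to tell the two apart from inside the pod (the pod name is taken from the gk-deploy output below and curl is assumed to exist in the image):

```bash
# Pod name as shown by `kubectl get pods`; substitute your own.
POD=deploy-heketi-5f6c465bb8-dcr9l

# Hit the API server's cluster IP from inside the heketi pod.
# Any HTTP response, even 401/403, proves the service network works;
# a hang ending in a timeout points at kube-proxy/CNI instead.
kubectl -n default exec -i "$POD" -- \
  curl -sk -m 10 -o /dev/null -w '%{http_code}\n' https://10.96.0.1:443/version
```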
Trying curl:
curl https://10.96.0.1:443/api/v1/namespaces/default/pods?labelSelector=glusterfs-node -k
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {
},
"status": "Failure",
"message": "pods is forbidden: User \"system:anonymous\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\"",
"reason": "Forbidden",
"details": {
"kind": "pods"
},
"code": 403
}
Hmm, I think it is a permission error?
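That 403 is actually expected: the bare `curl -k` sends no credentials, so the API server sees the request as `system:anonymous`. Heketi itself authenticates with its service-account token, so a more representative check is (a sketch; `heketi-service-account` is the name gk-deploy creates, adjust if yours differs):

```bash
# Ask the API server whether heketi's service account may list pods.
kubectl auth can-i list pods --namespace default \
  --as system:serviceaccount:default:heketi-service-account

# Or, from inside the heketi pod, repeat the curl with the mounted
# token, which is the credential heketi actually uses.
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
curl -sk -H "Authorization: Bearer $TOKEN" \
  "https://10.96.0.1:443/api/v1/namespaces/default/pods?labelSelector=glusterfs-node"
```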
Now I got a different error: `Error: Failed to allocate new volume: No online storage devices in cluster`
Checking status of pods matching '--selector=deploy-heketi=pod':
deploy-heketi-5f6c465bb8-dcr9l 1/1 Running 0 16s
OK
Determining heketi service URL ... OK
/bin/kubectl -n default exec -i deploy-heketi-5f6c465bb8-dcr9l -- heketi-cli -s http://localhost:8080 --user admin --secret '' topology load --json=/etc/heketi/topology.json 2>&1
Creating cluster ... ID: 833a3f14267b9d2708a1a36aee21752f
Allowing file volumes on cluster.
Allowing block volumes on cluster.
Creating node bdprdsn0002 ... ID: 84ee287304ae70226d6c068ad5b0021a
Adding device /dev/sdc ... return 0
heketi topology loaded.
/bin/kubectl -n default exec -i deploy-heketi-5f6c465bb8-dcr9l -- heketi-cli -s http://localhost:8080 --user admin --secret '' setup-openshift-heketi-storage --listfile=/tmp/heketi-storage.json 2>&1
Error: Failed to allocate new volume: No online storage devices in cluster
command terminated with exit code 255
Failed on setup openshift heketi storage
This may indicate that the storage must be wiped and the GlusterFS nodes must be reset.
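"No online storage devices in cluster" usually means the devices that were just added are marked offline or failed in heketi's database. A sketch of how to inspect them, using the pod name from the log above:

```bash
# Show clusters, nodes, and devices, including their online state.
kubectl -n default exec -i deploy-heketi-5f6c465bb8-dcr9l -- \
  heketi-cli -s http://localhost:8080 --user admin --secret '' topology info

# If a device is listed but offline, it can be re-enabled by its ID
# (taken from the topology output):
# kubectl -n default exec -i deploy-heketi-5f6c465bb8-dcr9l -- \
#   heketi-cli -s http://localhost:8080 --user admin --secret '' device enable <device-id>
```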
Executing the topology load manually on a master node, I got "too many open files" from the heketi pod:
[root@bdprdmn0003 deploy]# heketi-cli topology load --json=topology.json
Found node bdprdsn0002 on cluster 833a3f14267b9d2708a1a36aee21752f
Adding device /dev/sdc ... Unable to add device: WARNING: Device /dev/sdc not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/vg-root/lv_root not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/sda1 not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/vg-root/lv_usr not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/sda2 not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/vg_data/lv_opt not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/sda3 not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/vg_data/lv_home not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/sda5 not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/sda6 not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/vg-root/lv_tmp not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/vg-root/lv_var not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/sdb1 not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/sdc not initialized in udev database even after waiting 10000000 microseconds.
Can't initialize physical volume "/dev/sdc" of volume group "vg_f69a0ab1d705f0a84b648f25d5018809" without -ff
/dev/sdc: physical volume not initialized.
Creating node bdprdsn0003 ... ID: add9a6ab175d2b4f292d8c2675b84b31
Adding device /dev/sdc ... OK
Creating node bdprdsn0004 ... ID: bde73d725835bf12c3284e9d98c6ef08
Adding device /dev/sdc ... Unable to add device: Get http://10.100.30.141:8080/queue/971117d58b46fc72ddc39a8d0a1f3681: dial tcp 10.100.30.141:8080: socket: too many open files
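Two separate problems show up here. The `Can't initialize physical volume "/dev/sdc" ... without -ff` on bdprdsn0002 means the disk still carries LVM metadata from an earlier attempt, so it has to be wiped before heketi can take it. A minimal sketch, run on the storage node itself; this is destructive, so double-check the device name first:

```bash
# DESTRUCTIVE: clears the old LVM state on /dev/sdc. Run on the
# affected storage node only after confirming the device is correct.
pvremove -ff -y /dev/sdc   # drop the stale PV label, if any
wipefs -a /dev/sdc         # erase any remaining signatures
```

The `socket: too many open files` on bdprdsn0004 is a different issue: the heketi process ran out of file descriptors, and deleting the deploy-heketi pod so it restarts before retrying is the usual workaround.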
Resuming `./gk-deploy -gv`, I got another error on `setup-openshift-heketi-storage`:
Found node bdprdsn0002 on cluster 833a3f14267b9d2708a1a36aee21752f
Found device /dev/sdc
Found node bdprdsn0003 on cluster 833a3f14267b9d2708a1a36aee21752f
Found device /dev/sdc
Found node bdprdsn0004 on cluster 833a3f14267b9d2708a1a36aee21752f
Found device /dev/sdc
heketi topology loaded.
/bin/kubectl -n default exec -i deploy-heketi-5f6c465bb8-dcr9l -- heketi-cli -s http://localhost:8080 --user admin --secret '' setup-openshift-heketi-storage --listfile=/tmp/heketi-storage.json 2>&1
/bin/kubectl -n default exec -i deploy-heketi-5f6c465bb8-dcr9l -- cat /tmp/heketi-storage.json | /bin/kubectl -n default create -f - 2>&1
cat: /tmp/heketi-storage.json: No such file or directory
command terminated with exit code 1
error: no objects passed to create
Failed on creating heketi storage resources.
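The `cat: /tmp/heketi-storage.json: No such file or directory` is a follow-on failure: `setup-openshift-heketi-storage` exited without writing the list file, and gk-deploy then tried to `kubectl create` from a file that does not exist. Re-running the step by hand surfaces the real error (a sketch, using the pod name from the log above):

```bash
# Re-run the failing step manually; the list file is only written when
# volume creation succeeds, which is why cat found nothing to read.
kubectl -n default exec -i deploy-heketi-5f6c465bb8-dcr9l -- \
  heketi-cli -s http://localhost:8080 --user admin --secret '' \
  setup-openshift-heketi-storage --listfile=/tmp/heketi-storage.json

# Only once that succeeds, feed the generated manifest to kubectl.
kubectl -n default exec -i deploy-heketi-5f6c465bb8-dcr9l -- \
  cat /tmp/heketi-storage.json | kubectl -n default create -f -
```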
just wow
Got any answer for that?
I have a similar error, any idea?
/usr/local/bin/kubectl -n default exec -i deploy-heketi-5f6c465bb8-d4j7n -- heketi-cli -s http://localhost:8080 --user admin --secret '' topology load --json=/etc/heketi/topology.json 2>&1
Found node ip-10-44-10-51.us-west-1.compute.internal on cluster 9dfb7ecab0b39aae0e0e6b12c235a763
Adding device /dev/xvdf ... return 0
heketi topology loaded.
/usr/local/bin/kubectl -n default exec -i deploy-heketi-5f6c465bb8-d4j7n -- heketi-cli -s http://localhost:8080 --user admin --secret '' setup-openshift-heketi-storage --listfile=/tmp/heketi-storage.json 2>&1
/usr/local/bin/kubectl -n default exec -i deploy-heketi-5f6c465bb8-d4j7n -- cat /tmp/heketi-storage.json | /usr/local/bin/kubectl -n default create -f - 2>&1
cat: /tmp/heketi-storage.json: No such file or directory
command terminated with exit code 1
error: no objects passed to create
Failed on creating heketi storage resources.
@ananbas I also encountered this problem. How did you solve it?
Hi Lucian,
I gave up on this and used openebs instead.
BR,
@ananbas Hi, the reason for this problem is that the pod is not connected to the service network, so the pod cannot even reach the API server. When starting the virtual machines, the nodes should be started first, and the master only after the nodes are up. heketi logs:
[kubeexec] ERROR 2019/02/24 06:30:33 heketi/pkg/remoteexec/kube/target.go:134:kube.TargetDaemonSet.GetTargetPod: Get https://10.96.0.1:443/api/v1/namespaces/default/pods?labelSelector=glusterfs-node: dial tcp 10.96.0.1:443: i/o timeout
[kubeexec] ERROR 2019/02/24 06:30:33 heketi/pkg/remoteexec/kube/target.go:135:kube.TargetDaemonSet.GetTargetPod: Failed to get list of pods
[cmdexec] ERROR 2019/02/24 06:30:33 heketi/executors/cmdexec/peer.go:81:cmdexec.(*CmdExecutor).GlusterdCheck: Failed to get list of pods
[heketi] ERROR 2019/02/24 06:30:33 heketi/apps/glusterfs/app_node.go:107:glusterfs.(*App).NodeAdd: Failed to get list of pods
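If the service network is the suspect, it is worth confirming that kube-proxy and the CNI pods are healthy before restarting VMs. A sketch, where the kube-proxy label and iptables proxy mode are assumptions matching a kubeadm-style cluster:

```bash
# kube-proxy programs the 10.96.0.1 service VIP; it must be Running on
# every node for pods to reach the API server's cluster IP.
kubectl -n kube-system get pods -l k8s-app=kube-proxy -o wide

# On a node, confirm the kubernetes service VIP is present in the NAT
# rules (iptables proxy mode assumed).
iptables-save -t nat | grep 10.96.0.1
```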
I'm running Kubernetes with 3 masters and 3 workers. When deploying with gk-deploy, it always gets stuck at the heketi topology load. Kernel modules loaded, check.
It's been two days of heavy browsing through related searches with no working solution, and so many `./gk-deploy -gv --abort && ./gk-deploy -gv` runs :)
Running the heketi topology load from inside the pod gives no output; it just exits immediately.
Running the heketi topology load on a master node gives `Unable to create node: New Node doesn't have glusterd running` (see the glusterd check sketched below).
kubectl get pod
kubectl get nodes
topology file
/dev/sdc is a 1 TB HDD, no PVs, unformatted.
Edit: firewalld disabled, SELinux disabled, swap disabled.
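For the "New Node doesn't have glusterd running" part, it helps to verify glusterd directly in each GlusterFS pod before retrying. A sketch; the `glusterfs-node` label comes from the heketi log above, and the pods are assumed to run systemd, as the official gluster container does:

```bash
# Check glusterd inside every GlusterFS daemonset pod.
for pod in $(kubectl -n default get pods -l glusterfs-node \
    -o jsonpath='{.items[*].metadata.name}'); do
  echo "== $pod =="
  kubectl -n default exec "$pod" -- systemctl is-active glusterd
done
```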
thank you
BR, Anung