Closed: timkock closed this issue 5 years ago
What can I do to make it easier for someone to help me with this? I already feel a bit bad (haha) about bothering you with it, since the work on this repo is already awesome.
It would be great to get an update on how to resolve this issue.
I missed this issue when it was originally raised.
Since you mention gluster pods I will assume you're using containerized gluster as opposed to external gluster. Can you please update with your gluster container image and version?
As you discovered, the gluster pods don't use the typical /etc/fstab. The /var/lib/heketi/fstab file, which serves the same purpose, is used by a custom start-up script in the container to mount the brick file systems. You mention that this doesn't work. Could you please provide more details about what errors occurred when you tried to mount the contents of the file yourself?
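For reference, one way to try that by hand from inside a gluster pod is sketched below; the namespace and pod name are placeholders for whatever your deployment uses.
# inspect the heketi-managed fstab and try to mount its entries manually;
# this is the same file the container start-up script consumes
kubectl -n <namespace> exec -it <glusterfs-pod> -- cat /var/lib/heketi/fstab
kubectl -n <namespace> exec -it <glusterfs-pod> -- mount -a --fstab /var/lib/heketi/fstab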
You mention Azure. I have heard that Azure does not keep device names like /dev/sdX stable. Is it possible that, during your test, devices that were referenced by one name change to another name after the simulated crash?
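One way to check that (just a sketch, not Azure-specific advice) is to record the stable identifiers before and after the deallocation and compare them against the /dev/sdX names:
# compare stable identifiers with the kernel device names before/after the crash
lsblk -o NAME,SIZE,SERIAL,UUID,MOUNTPOINT
ls -l /dev/disk/by-id/    # the symlinks here survive /dev/sdX renumbering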
Hi there, I have the same problem.
Before node reboot:
[root@k8s-01 /]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 40G 0 disk
└─sda1 8:1 0 40G 0 part /var/lib/glusterd
sdb 8:16 0 40G 0 disk
├─vg_09203dbe2b452ba01d8289ee5a1489b1-tp_8a586f6fdc926a5711ff0ebda31a93e8_tmeta 253:0 0 12M 0 lvm
│ └─vg_09203dbe2b452ba01d8289ee5a1489b1-tp_8a586f6fdc926a5711ff0ebda31a93e8-tpool 253:2 0 2G 0 lvm
│ ├─vg_09203dbe2b452ba01d8289ee5a1489b1-tp_8a586f6fdc926a5711ff0ebda31a93e8 253:3 0 2G 0 lvm
│ └─vg_09203dbe2b452ba01d8289ee5a1489b1-brick_8a586f6fdc926a5711ff0ebda31a93e8 253:4 0 2G 0 lvm /var/lib/heketi/mounts/vg_09203dbe2b452ba01d8289ee5a1489b1/brick_8a586f6fdc926a5711ff0ebda31a93e8
└─vg_09203dbe2b452ba01d8289ee5a1489b1-tp_8a586f6fdc926a5711ff0ebda31a93e8_tdata 253:1 0 2G 0 lvm
└─vg_09203dbe2b452ba01d8289ee5a1489b1-tp_8a586f6fdc926a5711ff0ebda31a93e8-tpool 253:2 0 2G 0 lvm
├─vg_09203dbe2b452ba01d8289ee5a1489b1-tp_8a586f6fdc926a5711ff0ebda31a93e8 253:3 0 2G 0 lvm
└─vg_09203dbe2b452ba01d8289ee5a1489b1-brick_8a586f6fdc926a5711ff0ebda31a93e8 253:4 0 2G 0 lvm /var/lib/heketi/mounts/vg_09203dbe2b452ba01d8289ee5a1489b1/brick_8a586f6fdc926a5711ff0ebda31a93e8
after node reboot + daemon reboot:
[root@k8s-01 /]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 40G 0 disk
└─sda1 8:1 0 40G 0 part /var/lib/glusterd
sdb 8:16 0 40G 0 disk
How can I mount everything listed in /var/lib/heketi/fstab?
I tried:
[root@k8s-01 /]# mount -a --fstab /var/lib/heketi/fstab
mount: special device /dev/mapper/vg_67043fcaa37dcb3b5560a4d70cdde6e8-brick_bbdf160150776be346a650c278cb101b does not exist
mount: special device /dev/mapper/vg_89aa25beca04a1ed8b26f1d3d916abd2-brick_a42ca5903123ba1a9204af946aa32d55 does not exist
mount: special device /dev/mapper/vg_09203dbe2b452ba01d8289ee5a1489b1-brick_8a586f6fdc926a5711ff0ebda31a93e8 does not exist
...
[root@k8s-01 /]# gluster-setup.sh
mkdir: cannot create directory ‘/var/log/glusterfs/container’: File exists
/etc/glusterfs is not empty
/var/log/glusterfs is not empty
/var/lib/glusterd is not empty
mount: special device /dev/mapper/vg_67043fcaa37dcb3b5560a4d70cdde6e8-brick_bbdf160150776be346a650c278cb101b does not exist
mount: special device /dev/mapper/vg_89aa25beca04a1ed8b26f1d3d916abd2-brick_a42ca5903123ba1a9204af946aa32d55 does not exist
mount: special device /dev/mapper/vg_09203dbe2b452ba01d8289ee5a1489b1-brick_8a586f6fdc926a5711ff0ebda31a93e8 does not exist
/usr/sbin/gluster-setup.sh: line 83: [: 4 /var/log/glusterfs/container/failed_bricks: integer expression expected
Script Ran Successfully
...
[root@k8s-01 /]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 40G 0 disk
└─sda1 8:1 0 40G 0 part /var/lib/heketi
sdb 8:16 0 40G 0 disk
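In case it helps: the "special device ... does not exist" errors above usually mean the logical volumes behind the bricks were never activated after the reboot, so the /dev/mapper devices are missing. A rough recovery sketch, run on the affected node or inside its gluster pod (adjust to your setup, this is not an official procedure):
# make sure the device-mapper/thin-provisioning modules are present
lsmod | grep dm_
# rescan and activate all volume groups so the brick LVs reappear
vgscan
vgchange -ay
ls /dev/mapper/ | grep brick_
# then retry mounting the heketi-managed fstab
mount -a --fstab /var/lib/heketi/fstab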
Re,
In my case I found it ...
These modules were not persisted after a reboot.
@OlivierMary depending on your OS/distro you want to make sure the device-mapper modules are loaded. We try to make it possible for the pod to auto-load kernel modules (see https://github.com/gluster/gluster-kubernetes/blob/master/deploy/kube-templates/glusterfs-daemonset.yaml#L63 ), but if those modules are still not loaded after your pod starts, it may be that either that line or the mount point it refers to does not exist. It depends on what version you used.
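For anyone checking this manually, a minimal sketch; the module list and file name below are examples, not an official recommendation:
# on the host (or inside the pod): verify the device-mapper modules are loaded
lsmod | grep -E 'dm_thin_pool|dm_snapshot|dm_mirror'
# load them on the host if they are missing
modprobe dm_thin_pool dm_snapshot dm_mirror
# and persist them across reboots on systemd-based hosts
printf 'dm_thin_pool\ndm_snapshot\ndm_mirror\n' > /etc/modules-load.d/gluster.conf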
We recently faced exactly the same issue with our GlusterFS cluster running in Kubernetes on Azure. I found out that the mappings in /dev/mapper were missing compared to the underlying host system when I ran "blkid" in the GlusterFS pods.
By adding the host path "/dev" to the daemon set and restarting all GlusterFS pods, I got our cluster back online.
I hope this information is helpful.
OS: CentOS Linux release 7.5.1804
Distribution (Kubernetes): OpenShift OKD 3.11
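In case it's useful, the change was roughly equivalent to the patch below; the namespace, daemonset name, container index, and pod label are assumptions from my setup, so adjust them to your deployment.
kubectl -n glusterfs patch daemonset glusterfs --type=json -p='[
  {"op":"add","path":"/spec/template/spec/volumes/-",
   "value":{"name":"host-dev","hostPath":{"path":"/dev"}}},
  {"op":"add","path":"/spec/template/spec/containers/0/volumeMounts/-",
   "value":{"name":"host-dev","mountPath":"/dev"}}
]'
# then delete the GlusterFS pods so the daemon set recreates them with /dev mounted
kubectl -n glusterfs delete pod -l glusterfs=pod   # label depends on your deployment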
@nixpanic do you think the comment by @daskanu here could be related to what you did in 2f1114a, similar to #542 ?
Hmm, yes, that seems possible. I guess we'll need to add /dev/mapper as well. Users may have configured multipath (or other device-mapper targets) and in that case they would want to pass /dev/mapper/... device names.
sigh
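A quick way to confirm whether a pod is affected is to compare the host's /dev/mapper with what the container sees; the namespace and pod name below are placeholders.
# on the host
ls -l /dev/mapper/
# inside the gluster pod: if the vg_* entries are missing (or are dangling
# symlinks) here, the /dev (or /dev/mapper) passed into the pod is incomplete
kubectl -n <namespace> exec -it <glusterfs-pod> -- ls -l /dev/mapper/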
I found that in the case where udev has created the entries in /dev/mapper as symlinks, they are not available to the gluster pod. If I manually delete them (rm -f /dev/mapper/vg_* #CarefulHere) and then recreate them by running vgscan --mknodes, which falls back to "direct link creation", the problem is resolved. It is of course necessary to restart the gluster pods on the individual nodes afterwards.
I.e. this does not work for me:
[root@srv04 mapper]# ls -al
total 0
drwxr-xr-x  2 root root  200 Jun 27 15:22 .
drwxr-xr-x 21 root root 4400 Jul  1 12:04 ..
crw------- 1 root root 10, 236 Jun 27 15:22 control
lrwxrwxrwx 1 root root 7 Jun 27 15:22 fedora-root -> ../dm-0
lrwxrwxrwx 1 root root 7 Jun 27 15:22 fedora-swap -> ../dm-1
lrwxrwxrwx 1 root root 7 Jun 27 15:22 vg_4cca2fc8cc3671bcf8c482f156f0438f-brick_5b1459e816925bc320e5a0ef7d284fd9 -> ../dm-6
lrwxrwxrwx 1 root root 7 Jun 27 15:22 vg_4cca2fc8cc3671bcf8c482f156f0438f-tp_5b1459e816925bc320e5a0ef7d284fd9 -> ../dm-5
lrwxrwxrwx 1 root root 7 Jun 27 15:22 vg_4cca2fc8cc3671bcf8c482f156f0438f-tp_5b1459e816925bc320e5a0ef7d284fd9_tdata -> ../dm-3
lrwxrwxrwx 1 root root 7 Jun 27 15:22 vg_4cca2fc8cc3671bcf8c482f156f0438f-tp_5b1459e816925bc320e5a0ef7d284fd9_tmeta -> ../dm-2
lrwxrwxrwx 1 root root 7 Jun 27 15:22 vg_4cca2fc8cc3671bcf8c482f156f0438f-tp_5b1459e816925bc320e5a0ef7d284fd9-tpool -> ../dm-4
but this works a treat:
[root@srv04 mapper]# ls -al
total 0
drwxr-xr-x  2 root root  300 Jul  1 18:01 .
drwxr-xr-x 21 root root 4400 Jul  1 12:04 ..
crw------- 1 root root 10, 236 Jun 27 15:22 control
lrwxrwxrwx 1 root root 7 Jun 27 15:22 fedora-root -> ../dm-0
lrwxrwxrwx 1 root root 7 Jun 27 15:22 fedora-swap -> ../dm-1
brw-rw---- 1 root disk 253,  6 Jul  1 18:01 vg_4cca2fc8cc3671bcf8c482f156f0438f-brick_5b1459e816925bc320e5a0ef7d284fd9
brw-rw---- 1 root disk 253, 11 Jul  1 18:01 vg_4cca2fc8cc3671bcf8c482f156f0438f-brick_9825c39fb4efe83c80bb8aef85eef8ee
brw-rw---- 1 root disk 253,  5 Jul  1 18:01 vg_4cca2fc8cc3671bcf8c482f156f0438f-tp_5b1459e816925bc320e5a0ef7d284fd9
brw-rw---- 1 root disk 253,  3 Jul  1 18:01 vg_4cca2fc8cc3671bcf8c482f156f0438f-tp_5b1459e816925bc320e5a0ef7d284fd9_tdata
brw-rw---- 1 root disk 253,  2 Jul  1 18:01 vg_4cca2fc8cc3671bcf8c482f156f0438f-tp_5b1459e816925bc320e5a0ef7d284fd9_tmeta
brw-rw---- 1 root disk 253,  4 Jul  1 18:01 vg_4cca2fc8cc3671bcf8c482f156f0438f-tp_5b1459e816925bc320e5a0ef7d284fd9-tpool
brw-rw---- 1 root disk 253, 10 Jul  1 18:01 vg_4cca2fc8cc3671bcf8c482f156f0438f-tp_9825c39fb4efe83c80bb8aef85eef8ee
brw-rw---- 1 root disk 253,  8 Jul  1 18:01 vg_4cca2fc8cc3671bcf8c482f156f0438f-tp_9825c39fb4efe83c80bb8aef85eef8ee_tdata
brw-rw---- 1 root disk 253,  7 Jul  1 18:01 vg_4cca2fc8cc3671bcf8c482f156f0438f-tp_9825c39fb4efe83c80bb8aef85eef8ee_tmeta
brw-rw---- 1 root disk 253,  9 Jul  1 18:01 vg_4cca2fc8cc3671bcf8c482f156f0438f-tp_9825c39fb4efe83c80bb8aef85eef8ee-tpool
and the path between them is simply:
[root@srv04 mapper]# rm -f vg_4cca2fc8cc3671bcf8c482f156f0438f-*
[root@srv04 mapper]# vgscan --mknodes
  Reading volume groups from cache.
  Found volume group "vg_4cca2fc8cc3671bcf8c482f156f0438f" using metadata type lvm2
  Found volume group "fedora" using metadata type lvm2
  The link /dev/vg_4cca2fc8cc3671bcf8c482f156f0438f/brick_9825c39fb4efe83c80bb8aef85eef8ee should have been created by udev but it was not found. Falling back to direct link creation.
  Command failed with status code 5.
Hope that helps.
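Put together, the full sequence looks roughly like this (the namespace and pod name are whatever your deployment uses; be careful with the rm):
cd /dev/mapper
# remove only the heketi-created vg_* symlinks, not fedora-root / fedora-swap
rm -f vg_*
# let LVM recreate them; udev link creation fails, so it falls back to real device nodes
vgscan --mknodes
# restart the gluster pod on this node so it picks up the recreated nodes
kubectl -n <namespace> delete pod <glusterfs-pod-on-this-node>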
Kudos @lasselj: thank you for your instructions, you helped me bring back my heketidbstorage volume, which had gone read-only due to missing bricks.
Hey, @lasselj! Thanks a lot, mate. You saved my day!
Excellent work in this repository, really awesome. It was a little tricky to get it to work with acs-engine and premium managed disks in Azure, but it has delivered a lot of joy to an autoscaling machine learning environment using the excellent work from dask and kubernetes.
I have encountered a problem where I simulated a crash of the storage nodes (3 VMs in a VMSS scale set, each with a 256G SSD attached) by deallocating them and bringing them back online via the portal.
The nodes come back online and re-appear in kubernetes.
The heketi pod fails. The gluster pods start and they are able to see each other, checked with gluster peer status.
When running gluster volume status I get this:
Status of volume: heketidbstorage
Gluster process                                                                                                             TCP Port  RDMA Port  Online  Pid
Brick 10.240.0.34:/var/lib/heketi/mounts/vg_51efbe5a164aea2fa33494e18a9ffbf9/brick_29d842fea9638032b7f56a9e0535d3fe/brick   N/A       N/A        N       N/A
Brick 10.240.0.65:/var/lib/heketi/mounts/vg_d3637b69dbc9f7f16c2ec57970e91ec6/brick_05f7605bb3a9cdfa23688bef3d437bc4/brick   N/A       N/A        N       N/A
Brick 10.240.0.96:/var/lib/heketi/mounts/vg_ff83b346d75d6f006058760a9ea7612e/brick_12dc5a9894b5d1d35fc758bf93a3c192/brick   N/A       N/A        N       N/A
Self-heal Daemon on localhost                                                                                               N/A       N/A        Y       1463
Self-heal Daemon on 10.240.0.96                                                                                             N/A       N/A        Y       1008
Self-heal Daemon on 10.240.0.34                                                                                             N/A       N/A        Y       1100
Task Status of Volume heketidbstorage
There are no active volume tasks
Status of volume: vol_0914949f0473811c7c52cd063485778e
Gluster process                                                                                                             TCP Port  RDMA Port  Online  Pid
Brick 10.240.0.96:/var/lib/heketi/mounts/vg_ff83b346d75d6f006058760a9ea7612e/brick_c53b8198f3bd0d24b157b1e3f20c462b/brick   N/A       N/A        N       N/A
Brick 10.240.0.34:/var/lib/heketi/mounts/vg_bce1174abeceb199416b126276164dda/brick_42ad33ec09f0ec78378a32ca55929056/brick   N/A       N/A        N       N/A
Brick 10.240.0.65:/var/lib/heketi/mounts/vg_ca08d758b917b9b59d7a118701109e19/brick_59a845a8bec3ec44fe48535f69cf8749/brick   N/A       N/A        N       N/A
Brick 10.240.0.96:/var/lib/heketi/mounts/vg_c64eb42a12eccd3a2b38d09e8239b395/brick_27cc9ae0695607a1412551f4b19c7795/brick   N/A       N/A        N       N/A
Brick 10.240.0.34:/var/lib/heketi/mounts/vg_731de882b06e8688547a776b17f2b121/brick_8ddce90c41efa54aca645d486acc6363/brick   N/A       N/A        N       N/A
Brick 10.240.0.65:/var/lib/heketi/mounts/vg_9818c228513387be593ae0f664f52637/brick_bf61579ff96f10041bf64b0ef4880245/brick   N/A       N/A        N       N/A
Self-heal Daemon on localhost                                                                                               N/A       N/A        Y       1463
Self-heal Daemon on 10.240.0.96                                                                                             N/A       N/A        Y       1008
Self-heal Daemon on 10.240.0.34                                                                                             N/A       N/A        Y       1100
Task Status of Volume vol_0914949f0473811c7c52cd063485778e
There are no active volume tasks
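A quick check that narrows this down (run on each storage node; just a sketch): every brick listed as "Online N" above should correspond to a mounted brick filesystem on its node, so compare what is actually mounted with what the heketi-managed fstab expects.
# list the heketi brick filesystems that are currently mounted on this node
grep /var/lib/heketi/mounts /proc/mounts
# and compare with what heketi expects to be mounted
cat /var/lib/heketi/fstab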