gluster / glusterfs

Gluster Filesystem : Build your distributed storage in minutes
https://www.gluster.org
GNU General Public License v2.0

Gluster Volumes Fail to Mount after reboot #4173

Open · sgtcoder opened this issue 1 year ago

sgtcoder commented 1 year ago

Description of problem: GlusterFS 8, 9, and 10 all have the same problem: volumes fail to come back online after a reboot.

The exact command to reproduce the issue:

dnf install centos-release-gluster9 -y
nano /etc/yum.repos.d/CentOS-Gluster-9.repo

baseurl=https://dl.rockylinux.org/vault/centos/8.5.2111/storage/x86_64/gluster-9/
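The edit above points the repo at the Rocky Linux vault. For readers following along, a minimal sketch of what the edited stanza might look like (the section name and remaining fields are illustrative guesses at the stock file shipped by centos-release-gluster9, not taken from the report):

```
[centos-gluster9]
name=CentOS-8 - Gluster 9 (Rocky Linux vault)
baseurl=https://dl.rockylinux.org/vault/centos/8.5.2111/storage/x86_64/gluster-9/
enabled=1
gpgcheck=0   # illustrative; keep the stock gpgcheck/gpgkey settings if present
```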

dnf install glusterfs glusterfs-libs glusterfs-server -y

systemctl enable --now glusterfsd glusterd

firewall-cmd --add-service=glusterfs --permanent
firewall-cmd --reload

gluster volume create storage1 storage1.sgtcoder.com:/data/brick1/storage1

gluster volume start storage1
gluster volume status storage1
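Not part of the original steps, but a quick manual mount before touching fstab can separate volume problems from boot-ordering problems (a sketch reusing the paths above):

```
# Sanity check: mount by hand once, confirm the volume is reachable, unmount
mkdir -p /mnt/storage
mount -t glusterfs 127.0.0.1:/storage1 /mnt/storage
df -h /mnt/storage
umount /mnt/storage
```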

nano /etc/fstab

## GLUSTER ##
127.0.0.1:/storage1    /mnt/storage     glusterfs      defaults,_netdev,acl        0       0

systemctl daemon-reload && mount -a
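An aside not in the original report: because the brick and the client mount live on the same host, the fstab mount can race glusterd at boot. A commonly suggested variant (a sketch, assuming systemd honors the `x-systemd.*` mount options, as it does on EL8) defers the mount until first access and orders it after the daemon:

```
## GLUSTER ##
# noauto + x-systemd.automount: mount on first access instead of during early boot;
# x-systemd.requires=: explicit dependency on and ordering after glusterd.service.
127.0.0.1:/storage1  /mnt/storage  glusterfs  defaults,_netdev,acl,noauto,x-systemd.automount,x-systemd.requires=glusterd.service  0  0
```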

The full output of the command that failed:

The command doesn't fail; everything works until you reboot. After a reboot, the volume won't come back online until I force-stop the "fake" started volume (running the stop multiple times until it says it's unmounted), start it again, and remount:

```
yes | /usr/sbin/gluster volume stop storage1
/usr/sbin/gluster volume start storage1
/bin/mount -a -t glusterfs
```

I tried creating a startup service and everything. I have read every piece of documentation out there and tried hundreds of approaches on different virtual machines, and nothing ever seems to work.
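The report mentions trying a startup service. For reference, one shape such a workaround could take is a oneshot unit that replays the manual steps after glusterd is up. This is only a sketch (the unit name is illustrative, and it papers over the root cause rather than fixing it):

```
# /etc/systemd/system/gluster-remount.service  (illustrative name, not from the report)
[Unit]
Description=Workaround: restart gluster volume and remount after boot
After=network-online.target glusterd.service
Wants=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
# Replays the manual recovery steps from the report, in order
ExecStart=/bin/sh -c 'yes | /usr/sbin/gluster volume stop storage1'
ExecStart=/usr/sbin/gluster volume start storage1
ExecStart=/bin/mount -a -t glusterfs

[Install]
WantedBy=multi-user.target
```

With `Type=oneshot`, systemd runs the `ExecStart=` lines sequentially and keeps the unit active afterwards; enable it with `systemctl enable gluster-remount.service`.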
**Expected results:**

After every reboot, I would expect the volume to come back online.

**Mandatory info:**

**- The output of the `gluster volume info` command**:

```
[root@storage1 ~]# gluster volume info

Volume Name: storage1
Type: Distribute
Volume ID: 955b7fa1-ff1c-41a2-b467-fffffffff
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: storage1:/data/brick1/storage1
Options Reconfigured:
storage.fips-mode-rchecksum: on
transport.address-family: inet
```

**- The output of the `gluster volume status` command**:

```
[root@storage1 ~]# gluster volume status
Status of volume: storage1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick storage1:/data/brick1/storage1        57447     0          Y       381245
```

**- The output of the `gluster volume heal` command**:

```
[root@storage1 ~]# gluster volume heal storage1
Launching heal operation to perform index self heal on volume storage1 has been unsuccessful:
Self-heal-daemon is disabled. Heal will not be triggered on volume storage1
```

**- Provide logs present on following locations of client and server nodes**: /var/log/glusterfs/

**- Is there any crash? Provide the backtrace and coredump**: No crash.

**Additional info:**
**- The operating system / glusterfs version**: AlmaLinux 8 and GlusterFS 10
sgtcoder commented 1 year ago

Additionally, when I run these commands:

gluster volume start storage1
volume start: storage1: failed: Volume storage1 already started

Then

gluster volume status storage1

Online: N

How can it be started but offline?
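For context on the question: "Started" in `gluster volume status` is the volume's configured state in glusterd, while the `Online` column reflects whether the brick process (glusterfsd) is actually running, so a volume can be Started with all of its bricks offline. A few checks that usually narrow this down (a sketch; the brick log file name is derived from the brick path used above):

```
# Is the brick process actually running?
ps -ef | grep '[g]lusterfsd'

# Per-brick detail as glusterd sees it
gluster volume status storage1 detail

# The brick log usually records why the brick failed to start
less /var/log/glusterfs/bricks/data-brick1-storage1.log

# Restarts only the brick processes that are down, without stopping the volume
gluster volume start storage1 force
```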

zemzema commented 1 year ago

I think this problem was solved in later 9.x releases, but those have not been released for Rocky Linux 8.