gluster / gluster-containers

Dockerfiles (CentOS, Fedora, Red Hat) for GlusterFS
https://github.com/gluster/gluster-containers/pkgs/container/gluster-containers

4.1 image is really slow with Heketi and fails Install on OKD #128

Open jocelynthode opened 5 years ago

jocelynthode commented 5 years ago

Hey, I noticed that using the image tagged 4.1, I cannot install GlusterFS on OKD due to the error fixed by https://github.com/gluster/gluster-containers/pull/126

Can we please cherry-pick https://github.com/gluster/gluster-containers/pull/126 to the 4.1 branch and rebuild the images?

@humblec @mcquinne

humblec commented 5 years ago

@jocelynthode I have cherry-picked that patch and triggered the build. Let's see how it goes.

humblec commented 5 years ago

@jocelynthode you have a new image with the fix. Please let me know if it helps!

jocelynthode commented 5 years ago

@humblec Uh, I thought this PR would fix my issue. However, reading the bug reports I see that there is also a bug in LVM, which I seem to hit with the uninitialized udev database:

        Device /dev/sdd not initialized in udev database (1/100, 0 microseconds).
        Device /dev/sdd not initialized in udev database (2/100, 100000 microseconds).
        Device /dev/sdd not initialized in udev database (3/100, 200000 microseconds).
        Device /dev/sdd not initialized in udev database (4/100, 300000 microseconds).
        Device /dev/sdd not initialized in udev database (5/100, 400000 microseconds).
        Device /dev/sdd not initialized in udev database (6/100, 500000 microseconds).
        Device /dev/sdd not initialized in udev database (7/100, 600000 microseconds).
        Device /dev/sdd not initialized in udev database (8/100, 700000 microseconds).

This is a bit out of my comfort zone. Is there a workaround for this, or do we need to wait for a patched LVM version?
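For reference, one workaround that is sometimes suggested for these "not initialized in udev database" retries is to tell LVM not to consult udev for device information at all. This is only a sketch and is not confirmed against this particular container build (the later comments show the real fix was downgrading lvm2); the options themselves are standard `lvm.conf` settings:

```
# /etc/lvm/lvm.conf inside the container -- sketch of a possible
# workaround, not a confirmed fix for this issue.
devices {
    # Scan /dev directly instead of asking udev for the device list.
    obtain_device_list_from_udev = 0
    # Do not wait on udev to provide external device information.
    external_device_info_source = "none"
}
```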

jocelynthode commented 5 years ago

@humblec To note, I did not encounter any of these issues while using an older version of the container, which had gluster 4.1.6 in it.

jocelynthode commented 5 years ago

@humblec : Here's what I noticed between the two builds: the lvm2 packages are slightly different. On 4.1.6 we had lvm2-libs-2.02.180-10.el7_6.2.x86_64 and lvm2-2.02.180-10.el7_6.2.x86_64, while on 4.1.7 we have lvm2-libs-2.02.180-10.el7_6.3.x86_64 and lvm2-2.02.180-10.el7_6.3.x86_64.

4.1.6

sh-4.2# rpm -qa |egrep 'lvm2|udev|systemd'
python-pyudev-0.15-9.el7.noarch
systemd-219-62.el7.x86_64
lvm2-libs-2.02.180-10.el7_6.2.x86_64
lvm2-2.02.180-10.el7_6.2.x86_64
systemd-libs-219-62.el7.x86_64
systemd-sysv-219-62.el7.x86_64

4.1.7

sh-4.2# rpm -qa |egrep 'lvm2|udev|systemd'
python-pyudev-0.15-9.el7.noarch
systemd-219-62.el7_6.3.x86_64
lvm2-libs-2.02.180-10.el7_6.3.x86_64
lvm2-2.02.180-10.el7_6.3.x86_64
systemd-libs-219-62.el7_6.3.x86_64
systemd-sysv-219-62.el7_6.3.x86_64
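For reference, the two listings above can be diffed mechanically. A minimal sketch (the package lists are copied verbatim from the two `rpm -qa` outputs above; the temp-file paths are arbitrary):

```shell
#!/bin/bash
# Compare the package sets reported for the 4.1.6 and 4.1.7 images.
cat > /tmp/pkgs-4.1.6.txt <<'EOF'
python-pyudev-0.15-9.el7.noarch
systemd-219-62.el7.x86_64
lvm2-libs-2.02.180-10.el7_6.2.x86_64
lvm2-2.02.180-10.el7_6.2.x86_64
systemd-libs-219-62.el7.x86_64
systemd-sysv-219-62.el7.x86_64
EOF
cat > /tmp/pkgs-4.1.7.txt <<'EOF'
python-pyudev-0.15-9.el7.noarch
systemd-219-62.el7_6.3.x86_64
lvm2-libs-2.02.180-10.el7_6.3.x86_64
lvm2-2.02.180-10.el7_6.3.x86_64
systemd-libs-219-62.el7_6.3.x86_64
systemd-sysv-219-62.el7_6.3.x86_64
EOF
# Print only the lines unique to the 4.1.7 list, i.e. the changed packages.
comm -13 <(sort /tmp/pkgs-4.1.6.txt) <(sort /tmp/pkgs-4.1.7.txt)
```

This confirms that everything except python-pyudev moved to an el7_6.3 build between the two images.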

jocelynthode commented 5 years ago

@humblec Red Hat has downgraded their lvm2 packages (https://bugzilla.redhat.com/show_bug.cgi?id=1676921). Can we please do the same in the gluster image so that it starts working again?

I could try to submit a PR if you want.
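For anyone rebuilding locally in the meantime, a sketch of how such a downgrade could look in the CentOS Dockerfile. The NVRs are taken from the 4.1.6 listing above; whether these exact versions are still available in the CentOS repositories is an assumption:

```dockerfile
# Sketch only: pin lvm2 back to the el7_6.2 build that worked in the
# 4.1.6 image. Availability of these NVRs in the mirrors is an assumption.
RUN yum -y downgrade \
        lvm2-2.02.180-10.el7_6.2 \
        lvm2-libs-2.02.180-10.el7_6.2 && \
    yum clean all
```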

alibo commented 5 years ago

I've also faced this issue with version 4.1.7. Is there any ETA on fixing it?

jocelynthode commented 5 years ago

@alibo : As I still have not gotten an answer from @humblec about rebuilding the 4.1 image and setting up Docker Hub automated builds, I have rebuilt the 4.1 image from source and pushed it to my Docker Hub repository if you don't want to rebuild it yourself: https://hub.docker.com/r/jocelynthode/gluster-centos/tags

nixpanic commented 5 years ago

The change is already included with #134. Unfortunately, no new image was built and pushed to Docker Hub?! The latest gluster4u1_centos7 was built on 18 February... https://hub.docker.com/r/gluster/gluster-centos/tags

@humblec could you check why the automated image builds are failing again?

grig-tar commented 5 years ago

[offtopic] @nixpanic you have an automated build that rewrites the 'gluster4u1_centos7' tag every time? How can I know about a build update? You should increase the version number of the image when you update it. [/offtopic]

niiku commented 5 years ago

I just wanted to add that after a failed installation, the /var/lib/glusterd directory on the gluster nodes must be deleted before the image from @jocelynthode (thanks!) works. Sadly, the gluster/gluster-centos:latest image has still not been rebuilt.
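The cleanup described above would look something like this, run on every gluster node after a failed install and before retrying (a sketch; the path is the one named in the comment above):

```shell
# Run on each gluster node after a failed installation, before retrying.
# Removes glusterd's stale state so the next deployment starts clean.
rm -rf /var/lib/glusterd
```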

Arano-kai commented 4 years ago

I was deploying OKD recently and hit the same error. I had to rebuild the gluster-4.1 image with the downgrade statement commented out, and it went fine afterwards. Current CentOS has lvm2-libs-2.02.185-2.el7_7.2.x86_64 and lvm2-2.02.185-2.el7_7.2.x86_64.