gluster / glusterfs

Gluster Filesystem : Build your distributed storage in minutes
https://www.gluster.org
GNU General Public License v2.0

[bug:1779616] Glusterfs mount-dir is getting read-only system after 1 day #896

Closed gluster-ant closed 3 years ago

gluster-ant commented 4 years ago

URL: https://bugzilla.redhat.com/1779616
Creator: sharmaakshay890 at gmail
Time: 20191204T11:34:10

Description of problem: We mount GlusterFS volumes into Kubernetes pods (kafka, etcd, logstash, influxdb). After the setup has been running for a long time, the VM itself becomes inaccessible.

Version-Release number of selected component (if applicable):
GlusterFS version - 5.9
Linux version - 4.4.0-131-generic #157-Ubuntu SMP Thu Jul 12 15:51:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Kubernetes version - 1.9.5

How reproducible: Leave the setup running for a long time on VirtualBox VMs with 24 GB of RAM and 8 CPUs.

Actual results: The VM became inaccessible; we could not even SSH into it.

Logs (/var/log/glusterfs/bricks):

[2019-11-28 04:53:12.195465] E [MSGID: 113072] [posix-inode-fd-ops.c:1905:posix_writev] 0-gluster-volume-posix: write failed: offset 0, [Read-only file system]
[2019-11-28 04:53:12.195523] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-gluster-volume-server: 2565966: WRITEV 173 (ece0dcd6-9c14-4b94-bcd3-c8559c299852), client: CTX_ID:cb52c772-77e8-44be-b2c2-9b7b87ef8f7a-GRAPH_ID:0-PID:21526-HOST:deploy1-PC_NAME:gluster-volume-client-0-RECON_NO:-0, error-xlator: gluster-volume-posix [Read-only file system]
[2019-11-28 04:53:12.195673] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-gluster-volume-server: 2565967: WRITEV 232 (a99243ad-3346-4d5c-951c-f5e821c98bfe), client: CTX_ID:cb52c772-77e8-44be-b2c2-9b7b87ef8f7a-GRAPH_ID:0-PID:21526-HOST:deploy1-PC_NAME:gluster-volume-client-0-RECON_NO:-0, error-xlator: gluster-volume-posix [Read-only file system]
[2019-11-28 04:53:12.195923] E [MSGID: 113072] [posix-inode-fd-ops.c:1905:posix_writev] 0-gluster-volume-posix: write failed: offset 4096, [Read-only file system]
[2019-11-28 04:53:12.195993] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-gluster-volume-server: 2565968: WRITEV 45 (539acf44-e3e4-4083-95b8-98f08380a8eb), client: CTX_ID:cb52c772-77e8-44be-b2c2-9b7b87ef8f7a-GRAPH_ID:0-PID:21526-HOST:deploy1-PC_NAME:gluster-volume-client-0-RECON_NO:-0, error-xlator: gluster-volume-posix [Read-only file system]
[2019-11-28 04:53:12.196199] E [MSGID: 113072] [posix-inode-fd-ops.c:1905:posix_writev] 0-gluster-volume-posix: write failed: offset 61440, [Read-only file system]
[2019-11-28 04:53:12.196284] E [MSGID: 113072] [posix-inode-fd-ops.c:1905:posix_writev] 0-gluster-volume-posix: write failed: offset 40960, [Read-only file system]
[2019-11-28 04:53:12.196306] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-gluster-volume-server: 2565970: WRITEV 90 (3bd318e9-fd6b-46af-8b82-ce9f4ce22934), client: CTX_ID:cb52c772-77e8-44be-b2c2-9b7b87ef8f7a-GRAPH_ID:0-PID:21526-HOST:deploy1-PC_NAME:gluster-volume-client-0-RECON_NO:-0, error-xlator: gluster-volume-posix [Read-only file system]
[2019-11-28 04:53:12.196356] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-gluster-volume-server: 2565973: WRITEV 89 (2fce19d6-ae20-4b29-a381-63ef15b876eb), client: CTX_ID:cb52c772-77e8-44be-b2c2-9b7b87ef8f7a-GRAPH_ID:0-PID:21526-HOST:deploy1-PC_NAME:gluster-volume-client-0-RECON_NO:-0, error-xlator: gluster-volume-posix [Read-only file system]
[2019-11-28 04:53:12.196370] E [MSGID: 113072] [posix-inode-fd-ops.c:1905:posix_writev] 0-gluster-volume-posix: write failed: offset 0, [Read-only file system]
[2019-11-28 04:53:12.196511] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-gluster-volume-server: 2565971: WRITEV 179 (470e3858-f97a-4ebe-9c9c-b878879c9130), client: CTX_ID:cb52c772-77e8-44be-b2c2-9b7b87ef8f7a-GRAPH_ID:0-PID:21526-HOST:deploy1-PC_NAME:gluster-volume-client-0-RECON_NO:-0, error-xlator: gluster-volume-posix [Read-only file system]
[2019-11-28 04:53:12.196551] E [MSGID: 113072] [posix-inode-fd-ops.c:1905:posix_writev] 0-gluster-volume-posix: write failed: offset 0, [Read-only file system]
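A brick log like the one above is easier to read once the entries are tallied by message ID and function. The following is a minimal sketch, assuming only the entry layout visible in this excerpt (`[timestamp] LEVEL [MSGID: n] [file:line:function] ...`); the names `ENTRY_RE`, `summarize_brick_log`, and `SAMPLE_BRICK_LOG` are made up for illustration and are not part of GlusterFS.

```python
import re
from collections import Counter

# Header of a log entry as it appears in the excerpt above:
# [timestamp] LEVEL [MSGID: n] [source.c:line:function] ...
ENTRY_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] "
    r"(?P<level>[A-Z]) "
    r"\[MSGID: (?P<msgid>\d+)\] "
    r"\[(?P<src>[^\]:]+):(?P<line>\d+):(?P<func>[^\]]+)\]"
)


def summarize_brick_log(text):
    """Count error-level entries per (msgid, function).

    Also reports whether any entry mentions a read-only file system.
    Works on one-entry-per-line logs and on flattened text alike,
    because it scans for entry headers rather than splitting on newlines.
    """
    counts = Counter()
    for match in ENTRY_RE.finditer(text):
        if match.group("level") == "E":
            counts[(int(match.group("msgid")), match.group("func"))] += 1
    read_only_seen = "[Read-only file system]" in text
    return counts, read_only_seen


# Two entries copied from the brick log in this report.
SAMPLE_BRICK_LOG = (
    "[2019-11-28 04:53:12.195465] E [MSGID: 113072] "
    "[posix-inode-fd-ops.c:1905:posix_writev] 0-gluster-volume-posix: "
    "write failed: offset 0, [Read-only file system]\n"
    "[2019-11-28 04:53:12.195923] E [MSGID: 113072] "
    "[posix-inode-fd-ops.c:1905:posix_writev] 0-gluster-volume-posix: "
    "write failed: offset 4096, [Read-only file system]\n"
)
```

Calling `summarize_brick_log(SAMPLE_BRICK_LOG)` returns the counter together with a flag for the read-only condition. If every error comes from `posix_writev` and its server-side callback, the failure most likely sits below Gluster, in the backing file system.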


/var/log/syslog

deploy1 kernel: [89146.220855] blk_update_request: I/O error, dev sdb, sector 38523272
Nov 28 04:53:11 deploy1 kernel: [89146.222122] EXT4-fs warning (device sdb1): ext4_end_bio:330: I/O error -5 writing to inode 1048903 (offset 0 size 0 starting block 4815410)
Nov 28 04:53:11 deploy1 kernel: [89146.222132] Buffer I/O error on device sdb1, logical block 4815153
Nov 28 04:53:11 deploy1 kernel: [89146.223425] sd 3:0:0:0: rejecting I/O to offline device
Nov 28 04:53:11 deploy1 kernel: [89146.224577] sd 3:0:0:0: [sdb] killing request
Nov 28 04:53:11 deploy1 kernel: [89146.224590] sd 3:0:0:0: rejecting I/O to offline device
Nov 28 04:53:11 deploy1 kernel: [89146.225752] EXT4-fs warning (device sdb1): ext4_end_bio:330: I/O error -5 writing to inode 1315505 (offset 8904704 size 20480 starting block 5472643)
Nov 28 04:53:11 deploy1 kernel: [89146.225759] Buffer I/O error on device sdb1, logical block 5472381
Nov 28 04:53:11 deploy1 kernel: [89146.226883] Buffer I/O error on device sdb1, logical block 5472382
Nov 28 04:53:11 deploy1 kernel: [89146.227674] Buffer I/O error on device sdb1, logical block 5472383
Nov 28 04:53:11 deploy1 kernel: [89146.228241] Buffer I/O error on device sdb1, logical block 5472384
Nov 28 04:53:11 deploy1 kernel: [89146.228834] Buffer I/O error on device sdb1, logical block 5472385
Nov 28 04:53:11 deploy1 kernel: [89146.229429] Buffer I/O error on device sdb1, logical block 5472386
Nov 28 04:53:11 deploy1 kernel: [89146.229991] sd 3:0:0:0: rejecting I/O to offline device
Nov 28 04:53:11 deploy1 kernel: [89146.230517] EXT4-fs warning (device sdb1): ext4_end_bio:330: I/O error -5 writing to inode 1053838 (offset 0 size 0 starting block 5490330)
Nov 28 04:53:11 deploy1 kernel: [89146.230521] Buffer I/O error on device sdb1, logical block 5490073
Nov 28 04:53:11 deploy1 kernel: [89146.231046] EXT4-fs warning (device sdb1): ext4_end_bio:330: I/O error -5 writing to inode 1053838 (offset 1679360 size 12288 starting block 5490333)
Nov 28 04:53:11 deploy1 kernel: [89146.231050] Buffer I/O error on device sdb1, logical block 5490074
Nov 28 04:53:11 deploy1 kernel: [89146.231581] Buffer I/O error on device sdb1, logical block 5490075
Nov 28 04:53:11 deploy1 kernel: [89146.232119] sd 3:0:0:0: rejecting I/O to offline device
Nov 28 04:53:11 deploy1 kernel: [89146.232648] EXT4-fs warning (device sdb1): ext4_end_bio:330: I/O error -5 writing to inode 1053886 (offset 6234112 size 77824 starting block 5498629)
Nov 28 04:53:11 deploy1 kernel: [89146.232676] sd 3:0:0:0: rejecting I/O to offline device
Nov 28 04:53:11 deploy1 kernel: [89146.233197] EXT4-fs warning (device sdb1): ext4_end_bio:330: I/O error -5 writing to inode 1053892 (offset 6234112 size 77824 starting block 5500677)
Nov 28 04:53:11 deploy1 kernel: [89146.233229] sd 3:0:0:0: rejecting I/O to offline device
Nov 28 04:53:11 deploy1 kernel: [89146.233739] EXT4-fs warning (device sdb1): ext4_end_bio:330: I/O error -5 writing to inode 1053891 (offset 6234112 size 77824 starting block 5502725)
Nov 28 04:53:11 deploy1 kernel: [89146.233766] sd 3:0:0:0: rejecting I/O to offline device
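To see at a glance which device the kernel is complaining about, the syslog excerpt above can be summarized per device. This is a rough sketch that matches only the three message shapes present in this excerpt; the regular expressions and the names `summarize_kernel_errors` and `SAMPLE_SYSLOG` are illustrative assumptions, not a complete parser for kernel messages.

```python
import re
from collections import Counter

# Message shapes taken from the syslog excerpt in this report.
BLK_ERR_RE = re.compile(r"blk_update_request: I/O error, dev (\w+), sector (\d+)")
BUF_ERR_RE = re.compile(r"Buffer I/O error on device (\w+), logical block (\d+)")
OFFLINE_RE = re.compile(r"sd (\d+:\d+:\d+:\d+): rejecting I/O to offline device")


def summarize_kernel_errors(text):
    """Group kernel I/O error messages by the device they name."""
    block_errors = Counter(dev for dev, _sector in BLK_ERR_RE.findall(text))
    buffer_errors = Counter(dev for dev, _block in BUF_ERR_RE.findall(text))
    offline = sorted(set(OFFLINE_RE.findall(text)))
    return {
        "block_errors": block_errors,      # whole-disk names, e.g. sdb
        "buffer_errors": buffer_errors,    # partition names, e.g. sdb1
        "offline_scsi_addresses": offline, # host:channel:target:lun
    }


# Three lines copied from the syslog excerpt in this report.
SAMPLE_SYSLOG = (
    "deploy1 kernel: [89146.220855] blk_update_request: I/O error, "
    "dev sdb, sector 38523272\n"
    "Nov 28 04:53:11 deploy1 kernel: [89146.222132] Buffer I/O error on "
    "device sdb1, logical block 4815153\n"
    "Nov 28 04:53:11 deploy1 kernel: [89146.223425] sd 3:0:0:0: rejecting "
    "I/O to offline device\n"
)
```

In this report all three message types point at the same disk (`sdb`, `sdb1`, SCSI address `3:0:0:0`), which suggests the disk went offline as seen from the guest, rather than a problem specific to individual files.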

Expected results: The volume mount should remain accessible at all times.

Additional info: We statically provision the GlusterFS volume mounts in Kubernetes.

We create the GlusterFS volume with our own scripts, using the following commands:
sudo gluster peer probe
sudo gluster volume create
sudo gluster volume start

We keep the GlusterFS data on a secondary disk of the Linux host (e.g. /dev/sdb).

When cleaning up, we run:
sudo gluster peer detach
sudo gluster volume stop
sudo gluster volume delete
We then unmount the disk and format it as ext4.

Please let me know if I'm missing any command or not following the proper procedure.

gluster-ant commented 4 years ago

Time: 20191209T13:24:47 sabose at redhat commented: Rafi, can you take a look?

gluster-ant commented 4 years ago

Time: 20191211T18:18:31 rkavunga at redhat commented: It looks like the backend brick mount is corrupted. The system logs show I/O errors, most likely caused by a hardware failure or some other problem with the storage device. The errors in the gluster brick logs also come from the POSIX layer, the layer where gluster talks to the backend device. So it is highly likely that the problem lies with the backend mount.
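One way to check the backend mount is to look at how the kernel currently has it mounted. ext4 mounted with `errors=remount-ro` switches to read-only after certain errors, which would match the "Read-only file system" messages in the brick log. The sketch below parses text in the format of `/proc/mounts`; the brick path `/data/brick1/gv0`, the sample mount table, and the function names are hypothetical and do not come from this report.

```python
def find_mount(mounts_text, path):
    """Return (device, mountpoint, fstype, options) for the mount holding path.

    Picks the longest mountpoint that is a prefix of path. Mountpoints
    containing spaces (escaped as \\040 in /proc/mounts) are not handled.
    """
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        device, mountpoint, fstype, options = fields[:4]
        prefix = mountpoint.rstrip("/") + "/"
        if path == mountpoint or path.startswith(prefix):
            if best is None or len(mountpoint) > len(best[1]):
                best = (device, mountpoint, fstype, options.split(","))
    return best


def is_read_only(mounts_text, path):
    """True if the file system holding path is currently mounted read-only."""
    entry = find_mount(mounts_text, path)
    return entry is not None and "ro" in entry[3]


# Hypothetical mount table: the brick disk has been remounted read-only.
SAMPLE_MOUNTS = (
    "/dev/sda1 / ext4 rw,relatime,errors=remount-ro 0 0\n"
    "/dev/sdb1 /data/brick1 ext4 ro,relatime,data=ordered 0 0\n"
)
```

On a live host the first argument would be the contents of `/proc/mounts`, for example `is_read_only(open("/proc/mounts").read(), "/data/brick1/gv0")`. The third field of the returned entry is the file system type, which is also a quick way to confirm whether a brick sits on ext4 or XFS.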

gluster-ant commented 4 years ago

Time: 20191213T06:25:55 sharmaakshay890 at gmail commented: Thanks Rafi,

1) What could be the reason for this I/O error?

We are facing this issue on some of the machines, and when it happens we cannot do anything until we reboot the machine. After that everything comes back to normal.

We are unable to find the root cause.

2) I have shared some steps in the bug's Additional info.

Can you please let us know whether those steps are correct (for creating and cleaning up GlusterFS)?

3) What is the recommended backend storage for GlusterFS?

We are using a SATA disk with ext4.

gluster-ant commented 4 years ago

Time: 20200116T08:23:23 rkavunga at redhat commented: (In reply to sharmaakshay890 from comment #3)

> Thanks Rafi,
>
> 1) What could be the reason for this I/O error?
>
> We are facing this issue on some of the machines, and when it happens we cannot do anything until we reboot the machine. After that everything comes back to normal.
>
> We are unable to find the root cause.

I have to admit that I'm not an expert in disk failure cases. You could start with a disk health check and then see whether the file system is corrupted.

> 2) I have shared some steps in the bug's Additional info.
>
> Can you please let us know whether those steps are correct (for creating and cleaning up GlusterFS)?

Please refer to https://docs.gluster.org/en/latest/Administrator%20Guide/setting-up-storage/

> 3) What is the recommended backend storage for GlusterFS?
>
> We are using a SATA disk with ext4.

ext4 is perfectly fine, though gluster recommends XFS.

stale[bot] commented 4 years ago

Thank you for your contributions. We noticed that this issue has not had any activity in the last ~6 months! We are marking this issue as stale because it has not had recent activity. It will be closed in 2 weeks if no one responds with a comment here.

stale[bot] commented 3 years ago

Closing this issue as there has been no update since my last update on the issue. If this issue is still valid, feel free to open it again.