Closed dannylee- closed 3 years ago
Did you try a newer version of glusterfs? Many bugs have been fixed in newer versions, so maybe try 6.8?
After a few days of load testing, I could not find a way to reproduce the issue reliably, so I can't confirm whether upgrading to 6.8 would fix it. Some of the bug fixes that I thought could be related to this issue concern the rebalancing feature, which we aren't using.
Thank you for your contributions. This issue has had no activity in the last ~6 months, so we are marking it as stale. It will be closed in 2 weeks if no one responds with a comment here.
Closing this issue as there has been no update since my last comment. If the issue is still valid, feel free to reopen it.
Description of problem: Looks very similar to https://github.com/gluster/glusterfs/issues/784 and https://github.com/gluster/glusterfs/issues/783, but different stacktrace (read-ahead instead of open-behind)
The exact command to reproduce the issue: Could not reproduce, but there were a lot of files being read before it crashed.
The stacktrace:
[2020-02-27 15:57:41.059088] W [fuse-bridge.c:1506:fuse_fd_cbk] 0-glusterfs-fuse: 1668556410: OPEN() /somelocation/somefile.l.gz => -1 (Stale file handle)
pending frames:
frame : type(1) op(UNLINK)
frame : type(1) op(OPEN)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2020-02-27 15:57:41
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 6.5
The message "W [MSGID: 114031] [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk] 0-company-client-0: remote operation failed" repeated 12333 times between [2020-02-27 15:56:36.703301] and [2020-02-27 15:57:41.721945]
The message "E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-company-utime: dict set of key for set-ctime-mdata failed" repeated 12333 times between [2020-02-27 15:56:36.703320] and [2020-02-27 15:57:41.721948]
pending frames:
frame : type(1) op(UNLINK)
frame : type(1) op(OPEN)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2020-02-27 15:57:41
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 6.5
/lib64/libglusterfs.so.0(+0x27130)[0x7f3910c72130]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f3910c7cb34]
/lib64/libc.so.6(+0x363b0)[0x7f390f2af3b0]
/lib64/libuuid.so.1(+0x25b0)[0x7f39103d65b0]
/lib64/libuuid.so.1(+0x2646)[0x7f39103d6646]
/lib64/libglusterfs.so.0(uuid_utoa+0x1c)[0x7f3910c7bcac]
/usr/lib64/glusterfs/6.5/xlator/performance/io-cache.so(+0x5e55)[0x7f39039cce55]
/usr/lib64/glusterfs/6.5/xlator/performance/read-ahead.so(+0x1c16)[0x7f3903df0c16]
/usr/lib64/glusterfs/6.5/xlator/features/utime.so(+0x39ab)[0x7f39083149ab]
/usr/lib64/glusterfs/6.5/xlator/protocol/client.so(+0x73523)[0x7f390884c523]
/lib64/libgfrpc.so.0(+0xf021)[0x7f3910a1c021]
/lib64/libgfrpc.so.0(+0xf387)[0x7f3910a1c387]
/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7f3910a189f3]
/usr/lib64/glusterfs/6.5/rpc-transport/socket.so(+0xa875)[0x7f390b326875]
/lib64/libglusterfs.so.0(+0x8b806)[0x7f3910cd6806]
/lib64/libpthread.so.0(+0x7e65)[0x7f390fab1e65]
/lib64/libc.so.6(clone+0x6d)[0x7f390f37788d]
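For anyone comparing this trace with the ones in the linked issues, here is a rough Python sketch that splits glibc-style backtrace lines into library path, symbol, and offset, and lists the translators on the stack. The helper names (`parse_frames`, `xlator_names`) are made up for illustration and are not part of any gluster tooling; the sample frames are copied from the trace above.

```python
import os
import re
from typing import List, NamedTuple

# Matches lines such as:
#   /lib64/libglusterfs.so.0(uuid_utoa+0x1c)[0x7f3910c7bcac]
#   /usr/lib64/glusterfs/6.5/xlator/performance/io-cache.so(+0x5e55)[0x7f39039cce55]
FRAME_RE = re.compile(
    r"^(?P<path>/\S+?)\((?P<symbol>[^+()]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


class Frame(NamedTuple):
    path: str      # shared object the frame belongs to
    symbol: str    # empty when the symbol was not resolved
    offset: int    # offset inside the symbol, or inside the object if no symbol
    address: int   # absolute address at crash time


def parse_frames(text: str) -> List[Frame]:
    """Return the backtrace frames found in text, in stack order."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append(
                Frame(m["path"], m["symbol"],
                      int(m["offset"], 16), int(m["address"], 16))
            )
    return frames


def xlator_names(frames: List[Frame]) -> List[str]:
    """Names of the translator modules on the stack, innermost first."""
    names = []
    for f in frames:
        if "/xlator/" in f.path:
            base = os.path.basename(f.path)
            names.append(base[:-3] if base.endswith(".so") else base)
    return names


# Five frames copied from the trace in this report.
SAMPLE = """\
/lib64/libglusterfs.so.0(uuid_utoa+0x1c)[0x7f3910c7bcac]
/usr/lib64/glusterfs/6.5/xlator/performance/io-cache.so(+0x5e55)[0x7f39039cce55]
/usr/lib64/glusterfs/6.5/xlator/performance/read-ahead.so(+0x1c16)[0x7f3903df0c16]
/usr/lib64/glusterfs/6.5/xlator/features/utime.so(+0x39ab)[0x7f39083149ab]
/usr/lib64/glusterfs/6.5/xlator/protocol/client.so(+0x73523)[0x7f390884c523]
"""

if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        print(frame.path, frame.symbol or "<unresolved>", hex(frame.offset))
    print(xlator_names(parse_frames(SAMPLE)))
```

Frames with an empty symbol carry only an offset into the shared object. With the matching debuginfo installed, those offsets can usually be resolved to function names with a tool such as addr2line, though I have not done that for this trace.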
Expected results: The client does not crash
Additional info: Before the crash, there were numerous (~4,000) warnings about a "Stale file handle". Something like "W [fuse-bridge.c:1506:fuse_fd_cbk] 0-glusterfs-fuse: 1668523616: OPEN() /somefolder/somefile.l.gz (Stale file handle)". These warning log entries occurred for about 13 minutes right before the crash.
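To make the count and duration of those warnings easy to check against a client log, here is a minimal Python sketch that counts "Stale file handle" warnings from fuse-bridge and reports the first and last timestamp. It assumes the log line layout shown in this report; the function name `summarize_stale` is hypothetical, and the first sample line uses an invented timestamp purely so the example has a time span to measure.

```python
import re
from datetime import datetime

# Matches fuse-bridge warnings such as:
# [2020-02-27 15:57:41.059088] W [fuse-bridge.c:1506:fuse_fd_cbk] ... (Stale file handle)
STALE_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] W "
    r"\[fuse-bridge\.c:.*\(Stale file handle\)"
)


def summarize_stale(lines):
    """Return (count, first_timestamp, last_timestamp) for stale-handle warnings."""
    stamps = []
    for line in lines:
        m = STALE_RE.match(line)
        if m:
            stamps.append(datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S.%f"))
    if not stamps:
        return 0, None, None
    return len(stamps), min(stamps), max(stamps)


SAMPLE_LINES = [
    # Invented timestamp, for illustration only.
    "[2020-02-27 15:44:41.059088] W [fuse-bridge.c:1506:fuse_fd_cbk] "
    "0-glusterfs-fuse: 1668523616: OPEN() /somefolder/somefile.l.gz => -1 (Stale file handle)",
    # Line taken from the log excerpt in this report.
    "[2020-02-27 15:57:41.059088] W [fuse-bridge.c:1506:fuse_fd_cbk] "
    "0-glusterfs-fuse: 1668556410: OPEN() /somelocation/somefile.l.gz => -1 (Stale file handle)",
    # Unrelated line that must not be counted.
    "signal received: 11",
]

if __name__ == "__main__":
    count, first, last = summarize_stale(SAMPLE_LINES)
    print(count, first, last)
```

To run it on a real log, pass an open file object instead of `SAMPLE_LINES`; the location of the client log depends on the mount and is not given in this report.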
The output of the gluster volume info command:
Volume Name: company
Type: Replicate
Volume ID: 321e775a-d600-448c-9c0b-ef1a2340d1a9
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.125.10.251:/somelocation
Brick2: 10.125.9.13:/somelocation
Brick3: 10.125.11.44:/somelocation
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: true
transport.address-family: inet
performance.io-thread-count: 64
diagnostics.brick-log-level: WARNING
storage.fips-mode-rchecksum: on
The operating system / glusterfs version:
OS: CentOS 7.7.1908 (Core)
GlusterFS Version: 6.5