gluster / glusterfs

Gluster Filesystem : Build your distributed storage in minutes
https://www.gluster.org
GNU General Public License v2.0
4.75k stars 1.08k forks source link

Permission Denied in logs #832

Closed amarts closed 4 years ago

amarts commented 4 years ago

Description of problem:

I'm preparing the upgrade for our 5.10 gluster cluster running 1x4 replicate volumes to 7.X and decided to upgrade our test cluster first.

As soon as I upgraded to 7.0 (and now 7.1), I started seeing the following messages every 10 minutes in the log for one of the volumes:

[2019-12-19 21:27:55.041949] W [MSGID: 114031] [client-rpc-fops_v2.c:850:client4_0_setxattr_cbk] 0-dev_SNIP_data-client-0: remote operation failed [Permission denied] [2019-12-19 21:27:55.042002] W [MSGID: 114031] [client-rpc-fops_v2.c:850:client4_0_setxattr_cbk] 0-dev_SNIP_data-client-2: remote operation failed [Permission denied] [2019-12-19 21:27:55.042013] W [MSGID: 114031] [client-rpc-fops_v2.c:850:client4_0_setxattr_cbk] 0-dev_SNIP_data-client-1: remote operation failed [Permission denied] [2019-12-19 21:27:55.042634] E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-dev_SNIP_data-utime: dict set of key for set-ctime-mdata failed [Permission denied] The message "W [MSGID: 114031] [client-rpc-fops_v2.c:850:client4_0_setxattr_cbk] 0-dev_SNIP_data-client-0: remote operation failed [Permission denied]" repeated 2 times between [2019-12-19 21:27:55.041949] and [2019-12-19 21:27:55.047300] The message "W [MSGID: 114031] [client-rpc-fops_v2.c:850:client4_0_setxattr_cbk] 0-dev_SNIP_data-client-2: remote operation failed [Permission denied]" repeated 2 times between [2019-12-19 21:27:55.042002] and [2019-12-19 21:27:55.047312] The message "W [MSGID: 114031] [client-rpc-fops_v2.c:850:client4_0_setxattr_cbk] 0-dev_SNIP_data-client-1: remote operation failed [Permission denied]" repeated 2 times between [2019-12-19 21:27:55.042013] and [2019-12-19 21:27:55.047524] The message "E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-dev_SNIP_data-utime: dict set of key for set-ctime-mdata failed [Permission denied]" repeated 2 times between [2019-12-19 21:27:55.042634] and [2019-12-19 21:27:55.047887]

[2019-12-19 21:37:55.541329] W [MSGID: 114031] [client-rpc-fops_v2.c:850:client4_0_setxattr_cbk] 0-dev_SNIP_data-client-2: remote operation failed [Permission denied] [2019-12-19 21:37:55.541644] W [MSGID: 114031] [client-rpc-fops_v2.c:850:client4_0_setxattr_cbk] 0-dev_SNIP_data-client-1: remote operation failed [Permission denied] [2019-12-19 21:37:55.541681] W [MSGID: 114031] [client-rpc-fops_v2.c:850:client4_0_setxattr_cbk] 0-dev_SNIP_data-client-0: remote operation failed [Permission denied] [2019-12-19 21:37:55.542067] E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-dev_SNIP_data-utime: dict set of key for set-ctime-mdata failed [Permission denied] The message "W [MSGID: 114031] [client-rpc-fops_v2.c:850:client4_0_setxattr_cbk] 0-dev_SNIP_data-client-2: remote operation failed [Permission denied]" repeated 3 times between [2019-12-19 21:37:55.541329] and [2019-12-19 21:37:55.546695] The message "W [MSGID: 114031] [client-rpc-fops_v2.c:850:client4_0_setxattr_cbk] 0-dev_SNIP_data-client-1: remote operation failed [Permission denied]" repeated 3 times between [2019-12-19 21:37:55.541644] and [2019-12-19 21:37:55.546711] The message "W [MSGID: 114031] [client-rpc-fops_v2.c:850:client4_0_setxattr_cbk] 0-dev_SNIP_data-client-0: remote operation failed [Permission denied]" repeated 3 times between [2019-12-19 21:37:55.541681] and [2019-12-19 21:37:55.546761] The message "E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-dev_SNIP_data-utime: dict set of key for set-ctime-mdata failed [Permission denied]" repeated 3 times between [2019-12-19 21:37:55.542067] and [2019-12-19 21:37:55.547042]

etc.

The questions are: Is it a cause for concern? They weren't there before the upgrade. How can I determine what's causing the errors? How can I fix them and prevent them from spamming the logs?

Expected results:

No concerning logs

The operating system / glusterfs version: 7.x

gluster-ant commented 4 years ago

A patch https://review.gluster.org/24200 has been posted that references this issue.

utime: resolve an issue of permission denied logs

In case where uid is not set to be 0, there are possible errors from acl xlator.

The message "E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-dev_SNIP_data-utime: dict set of key for set-ctime-mdata failed [Permission denied]" repeated 2 times between [2019-12-19 21:27:55.042634] and [2019-12-19 21:27:55.047887]

Change-Id: Ieadf329835a40a13ac0bf908dac776e66954466c Fixes: #832 Signed-off-by: Amar Tumballi amar@kadalu.io

gluster-ant commented 4 years ago

A patch https://review.gluster.org/24200 has been posted that references this issue.

utime: resolve an issue of permission denied logs

In case where uid is not set to be 0, there are possible errors from acl xlator.

The message "E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-dev_SNIP_data-utime: dict set of key for set-ctime-mdata failed [Permission denied]" repeated 2 times between [2019-12-19 21:27:55.042634] and [2019-12-19 21:27:55.047887]

Change-Id: Ieadf329835a40a13ac0bf908dac776e66954466c Fixes: #832 Signed-off-by: Amar Tumballi amar@kadalu.io

gluster-ant commented 4 years ago

A patch https://review.gluster.org/24200 has been posted that references this issue.

utime: resolve an issue of permission denied logs

In case where uid is not set to be 0, there are possible errors from acl xlator. So, set uid = 0; with pid indicating this is set from UTIME activity.

The message "E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-dev_SNIP_data-utime: dict set of key for set-ctime-mdata failed [Permission denied]" repeated 2 times between [2019-12-19 21:27:55.042634] and [2019-12-19 21:27:55.047887]

Change-Id: Ieadf329835a40a13ac0bf908dac776e66954466c Fixes: #832 Signed-off-by: Amar Tumballi amar@kadalu.io

gluster-ant commented 4 years ago

A patch https://review.gluster.org/24264 has been posted that references this issue.

utime: resolve an issue of permission denied logs

In case where uid is not set to be 0, there are possible errors from acl xlator. So, set uid = 0; with pid indicating this is set from UTIME activity.

The message "E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-dev_SNIP_data-utime: dict set of key for set-ctime-mdata failed [Permission denied]" repeated 2 times between [2019-12-19 21:27:55.042634] and [2019-12-19 21:27:55.047887]

Change-Id: Ieadf329835a40a13ac0bf908dac776e66954466c Fixes: #832 Signed-off-by: Amar Tumballi amar@kadalu.io (cherry picked from commit eb916c057036db8289b41265797e5dce066d1512)

gluster-ant commented 4 years ago

A patch https://review.gluster.org/24264 has been posted that references this issue.

utime: resolve an issue of permission denied logs

In case where uid is not set to be 0, there are possible errors from acl xlator. So, set uid = 0; with pid indicating this is set from UTIME activity.

The message "E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-dev_SNIP_data-utime: dict set of key for set-ctime-mdata failed [Permission denied]" repeated 2 times between [2019-12-19 21:27:55.042634] and [2019-12-19 21:27:55.047887]

Change-Id: Ieadf329835a40a13ac0bf908dac776e66954466c Fixes: #832 Signed-off-by: Amar Tumballi amar@kadalu.io (cherry picked from commit eb916c057036db8289b41265797e5dce066d1512)

pranithk commented 4 years ago

This fix lead to the following crash: We can't trust frame after winding. I fixed it by doing a copy_frame. Will send the patch in a bit

(gdb) t 1 [Switching to thread 1 (Thread 0x7fd89f16d700 (LWP 117524))]

0 0x00007fd89e95c1ca in gf_utime_set_mdata_lookup_cbk (frame=0x7fd888001f50, cookie=0x7fd888002200, this=0x7fd89000e3c0, op_ret=0, op_errno=0, inode=0x7fd890041d20, stbuf=0x7fd89f16b9b0,

xdata=0x7fd890003900, postparent=0x7fd89f16b910) at ../../../../xlators/features/utime/src/utime.c:204

204 frame->root->uid = uid; (gdb) p frame $1 = (call_frame_t ) 0x7fd888001f50 (gdb) p frame->root $2 = (call_stack_t ) 0xdeadc0de00 (gdb) thread apply all bt Thread 1 (Thread 0x7fd89f16d700 (LWP 117524)):

0 0x00007fd89e95c1ca in gf_utime_set_mdata_lookup_cbk (frame=0x7fd888001f50, cookie=0x7fd888002200, this=0x7fd89000e3c0, op_ret=0, op_errno=0, inode=0x7fd890041d20, stbuf=0x7fd89f16b9b0, xdata=0x7fd890003900, postparent=0x7fd89f16b910) at ../../../../xlators/features/utime/src/utime.c:204

1 0x00007fd89e8791ed in client4_0_lookup_cbk (req=0x7fd888002ee0, iov=0x7fd888002f10, count=1, myframe=0x7fd888002200) at client-rpc-fops_v2.c:2632

2 0x00007fd8b1ad968c in rpc_clnt_handle_reply (clnt=0x7fd890046930, pollin=0x7fd890049df0) at rpc-clnt.c:768

3 0x00007fd8b1ad9bbd in rpc_clnt_notify (trans=0x7fd890046c30, mydata=0x7fd890046960, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7fd890049df0) at rpc-clnt.c:935

4 0x00007fd8b1ad59ec in rpc_transport_notify (this=0x7fd890046c30, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7fd890049df0) at rpc-transport.c:520

5 0x00007fd89f17aa34 in socket_event_poll_in_async (xl=0x7fd8900090f0, async=0x7fd890049f08) at socket.c:2502

6 0x00007fd89f172e17 in gf_async (async=0x7fd890049f08, xl=0x7fd8900090f0, cbk=0x7fd89f17a9dd ) at ../../../../libglusterfs/src/glusterfs/async.h:189

7 0x00007fd89f17abc3 in socket_event_poll_in (this=0x7fd890046c30, notify_handled=true) at socket.c:2543

8 0x00007fd89f17ba85 in socket_event_handler (fd=9, idx=2, gen=1, data=0x7fd890046c30, poll_in=1, poll_out=0, poll_err=0, event_thread_died=0 '\000') at socket.c:2934

9 0x00007fd8b1a01536 in event_dispatch_epoll_handler (event_pool=0x603cf0, event=0x7fd89f16c02c) at event-epoll.c:640

10 0x00007fd8b1a01a3d in event_dispatch_epoll_worker (data=0x7fd898000d80) at event-epoll.c:751

11 0x00007fd8b178e4e2 in start_thread () from /lib64/libpthread.so.0

12 0x00007fd8b1c176d3 in clone () from /lib64/libc.so.6

amarts commented 4 years ago

Thanks for this. Yes, we shouldn't use frame after unwind.

gluster-ant commented 4 years ago

A patch https://review.gluster.org/24282 has been posted that references this issue.

features/utime: Don't access frame after stack-wind

Problem: frame is accessed after stack-wind. This can lead to crash if the cbk frees the frame.

Fix: Use new frame for the wind instead.

Updates: #832 Change-Id: I64754609f1114b0bbd4d1336fa81a56f2cca6e03 Signed-off-by: Pranith Kumar K pkarampu@redhat.com

gluster-ant commented 4 years ago

A patch https://review.gluster.org/24289 has been posted that references this issue.

features/utime: Don't access frame after stack-wind

Problem: frame is accessed after stack-wind. This can lead to crash if the cbk frees the frame.

Fix: Use new frame for the wind instead.

Fixes: #832 Change-Id: I64754609f1114b0bbd4d1336fa81a56f2cca6e03 Signed-off-by: Pranith Kumar K pkarampu@redhat.com

gluster-ant commented 4 years ago

A patch https://review.gluster.org/24329 has been posted that references this issue.

utime: resolve an issue of permission denied logs

In case where uid is not set to be 0, there are possible errors from acl xlator. So, set uid = 0; with pid indicating this is set from UTIME activity.

The message "E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-dev_SNIP_data-utime: dict set of key for set-ctime-mdata failed [Permission denied]" repeated 2 times between [2019-12-19 21:27:55.042634] and [2019-12-19 21:27:55.047887]

Change-Id: Ieadf329835a40a13ac0bf908dac776e66954466c Fixes: #832 Signed-off-by: Amar Tumballi amar@kadalu.io (cherry picked from commit eb916c057036db8289b41265797e5dce066d1512)

gluster-ant commented 4 years ago

A patch https://review.gluster.org/24330 has been posted that references this issue.

features/utime: Don't access frame after stack-wind

Problem: frame is accessed after stack-wind. This can lead to crash if the cbk frees the frame.

Fix: Use new frame for the wind instead.

Updates: #832 Change-Id: I64754609f1114b0bbd4d1336fa81a56f2cca6e03 Signed-off-by: Pranith Kumar K pkarampu@redhat.com

gluster-ant commented 4 years ago

A patch https://review.gluster.org/24330 has been posted that references this issue.

features/utime: Don't access frame after stack-wind

Problem: frame is accessed after stack-wind. This can lead to crash if the cbk frees the frame.

Fix: Use new frame for the wind instead.

Updates: #832 Change-Id: I64754609f1114b0bbd4d1336fa81a56f2cca6e03 Signed-off-by: Pranith Kumar K pkarampu@redhat.com

stale[bot] commented 4 years ago

Thank you for your contributions. Noticed that this issue is not having any activity in last ~6 months! We are marking this issue as stale because it has not had recent activity. It will be closed in 2 weeks if no one responds with a comment here.

stale[bot] commented 4 years ago

Closing this issue as there was no update since my last update on issue. If this is an issue which is still valid, feel free to open it.