gluster / glusterfs

Gluster Filesystem : Build your distributed storage in minutes
https://www.gluster.org
GNU General Public License v2.0

Flooded with error=Permission Denied after self-heal completed in adding new brick #4266

Open nimaabgh opened 7 months ago

nimaabgh commented 7 months ago

Description of problem:

I added a new node to the replicated Gluster volume. After the healing process finished, the brick log on the new node was flooded with permission denied (Invalid ACL) errors. Running `chown` on the directory for a specific user from the client side cleared the errors, but they came back after 10-20 minutes. I do not mount the Gluster volume with ACLs. The other two nodes do not have this problem.
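Since `chown` clears the errors only temporarily, it may help to record the owner and mode of the affected directory on each brick and compare them. The following is a minimal sketch, not part of the original report: `describe_owner` and `differing_bricks` are hypothetical helper names, and the sketch assumes you run `describe_owner` locally on each node against the same directory under the brick path and collect the results by hand.

```python
import os
import stat


def describe_owner(path):
    """Return (uid, gid, octal mode string) for path, without following symlinks."""
    st = os.lstat(path)
    return st.st_uid, st.st_gid, oct(stat.S_IMODE(st.st_mode))


def differing_bricks(snapshots):
    """Given {brick_name: (uid, gid, mode)}, return the bricks whose value
    differs from the most common one. If no value is more common than the
    others, the choice of reference value is arbitrary."""
    values = list(snapshots.values())
    majority = max(set(values), key=values.count)
    return sorted(name for name, value in snapshots.items() if value != majority)
```

If one brick reports a different owner or mode than the other two, that points at metadata on that brick rather than at the client.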

My environment:

     Status: Started
     Snapshot Count: 0
     Number of Bricks: 1 x 3 = 3

     Brick1: gluster-01:/opt/gluster-volume/gv0 ---> (glusterfs 9.4)
     Brick2: gluster-02:/opt/gluster-volume/gv0 --> (glusterfs 10.4)
     Brick3: gluster-03:/opt/gluster-volume/gv0 --> (glusterfs 10.4)

     Options Reconfigured:
     performance.client-io-threads: off
     nfs.disable: on
     transport.address-family: inet
     storage.fips-mode-rchecksum: on
     cluster.granular-entry-heal: on
     cluster.shd-max-threads: 4
     cluster.data-self-heal: on
     cluster.metadata-self-heal: on
     cluster.entry-self-heal: on
     cluster.self-heal-daemon: on
     cluster.lookup-optimize: on
     features.acl: disbale

The exact command to reproduce the issue:

tail -f /var/log/glusterfs/bricks/opt-gluster-volume-gv0.log

The full output of the command that failed:

[2023-11-15 12:11:36.660499 +0000] E [MSGID: 115050] [server-rpc-fops_v2.c:149:server4_lookup_cbk] 0-gv0-server: LOOKUP info [{frame=153313103}, {path=/2023/11/test.3}, {uuid_utoa=d77b15cd-b706-4afe-832f-9b8189ba6387}, {bname=test.3}, {client=CTX_ID:f85eb878-cb4d-45fb-a90b-1278af0d96eb-GRAPH_ID:4-PID:1176-HOST:x.b.com_NAME:gv0-client-3-RECON_NO:-0}, {error-xlator=gv0-access-control}, {errno=13}, {error=Permission denied}]
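With the log flooding, it can be easier to summarize which paths are being denied than to read the raw lines. The sketch below is not from the original report. It assumes the brick log keeps the `{key=value}` field layout shown in the line above, and `parse_fields` and `count_denied_paths` are hypothetical helper names.

```python
import re
from collections import Counter

# Matches the {key=value} pairs that appear in the structured brick log line.
FIELD_RE = re.compile(r"\{([\w-]+)=([^}]*)\}")


def parse_fields(line):
    """Return the {key=value} pairs of one brick log line as a dict."""
    return dict(FIELD_RE.findall(line))


def count_denied_paths(lines):
    """Count log lines with errno=13 (Permission denied), grouped by path."""
    counts = Counter()
    for line in lines:
        fields = parse_fields(line)
        if fields.get("errno") == "13":
            counts[fields.get("path", "(unknown)")] += 1
    return counts


# The log line quoted in this report, used as sample input.
SAMPLE = (
    "[2023-11-15 12:11:36.660499 +0000] E [MSGID: 115050] "
    "[server-rpc-fops_v2.c:149:server4_lookup_cbk] 0-gv0-server: LOOKUP info "
    "[{frame=153313103}, {path=/2023/11/test.3}, "
    "{uuid_utoa=d77b15cd-b706-4afe-832f-9b8189ba6387}, {bname=test.3}, "
    "{client=CTX_ID:f85eb878-cb4d-45fb-a90b-1278af0d96eb-GRAPH_ID:4-PID:1176-"
    "HOST:x.b.com_NAME:gv0-client-3-RECON_NO:-0}, "
    "{error-xlator=gv0-access-control}, {errno=13}, {error=Permission denied}]"
)
```

To run it against the real log, pass an open file object for `/var/log/glusterfs/bricks/opt-gluster-volume-gv0.log` to `count_denied_paths`.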

Expected results:

**Mandatory info:**

**- The output of the `gluster volume info` command**:

```
Volume Name: gv0
Type: Replicate
Volume ID: 68f4d8ec-6acb-4af8-b13f-1449fdd28fbb
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: gluster-01:/opt/gluster-volume/gv0
Brick2: gluster-02:/opt/gluster-volume/gv0
Brick3: gluster-03:/opt/gluster-volume/gv0
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
storage.fips-mode-rchecksum: on
cluster.granular-entry-heal: on
cluster.shd-max-threads: 4
cluster.data-self-heal: on
cluster.metadata-self-heal: on
cluster.entry-self-heal: on
cluster.self-heal-daemon: on
cluster.lookup-optimize: on
features.acl: disbale
```

**- The output of the `gluster volume status` command**:

```
Status of volume: gv0
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gluster-01:/opt/glust
er-volume/gv0                               49152     0          Y       25626
Brick gluster-02:/opt/gluste
r-volume/gv0                                49152     0          Y       1086670
Brick gluster-03:/opt/glust
er-volume/gv0                               49234     0          Y       4011
Self-heal Daemon on localhost               N/A       N/A        Y       4027
Self-heal Daemon on gluster-01              N/A       N/A        Y       158309
Self-heal Daemon on gluster-02              N/A       N/A        Y       508479

Task Status of Volume gv0
------------------------------------------------------------------------------
There are no active volume tasks
```

**- The output of the `gluster volume heal` command**:

```
Brick gluster-01:/opt/gluster-volume/gv0
Status: Connected
Number of entries: 0

Brick gluster-02:/opt/gluster-volume/gv0
Status: Connected
Number of entries: 0

Brick gluster-03:/opt/gluster-volume/gv0
Status: Connected
Number of entries: 0
```

**- Provide logs present on following locations of client and server nodes** - /var/log/glusterfs/

On the **Client** side:

```
[2023-11-15 13:03:06.356267] W [MSGID: 114031] [client-rpc-fops_v2.c:2657:client4_0_lookup_cbk] 4-gv0-client-3: remote operation failed. Path: /2023/11 (9043f392-03f2-441b-bcf9-32ef7ecb43ad) [Permission denied]
[2023-11-15 13:03:06.358871] W [MSGID: 114031] [client-rpc-fops_v2.c:2657:client4_0_lookup_cbk] 4-gv0-client-3: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Permission denied]
```

On the **Server** side (the node that has the issue):

```
[2023-11-15 12:11:36.660499 +0000] E [MSGID: 115050] [server-rpc-fops_v2.c:149:server4_lookup_cbk] 0-gv0-server: LOOKUP info [{frame=153313103}, {path=/2023/11/test.3}, {uuid_utoa=d77b15cd-b706-4afe-832f-9b8189ba6387}, {bname=test.3}, {client=CTX_ID:f85eb878-cb4d-45fb-a90b-1278af0d96eb-GRAPH_ID:4-PID:1176-HOST:x.b.com_NAME:gv0-client-3-RECON_NO:-0}, {error-xlator=gv0-access-control}, {errno=13}, {error=Permission denied}]
```

**- Is there any crash ? Provide the backtrace and coredump**

**Additional info:**

- The operating system / glusterfs version:

     gluster-01  ---> glusterfs 9.4 --> CentOS Linux release 8.4.2105
     gluster-02  ---> glusterfs 10.4 --> Rocky Linux release 9.2 (Blue Onyx)
     gluster-03  ---> glusterfs 10.4 --> Rocky Linux release 9.2 (Blue Onyx)
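
The client-side warnings quoted earlier name the client subvolume that returned the error (`gv0-client-3` in both lines). To check whether every failure comes from the same subvolume, the client log can be tallied as sketched below. This is not from the original report. It assumes the client log keeps the "remote operation failed. Path: ... (gfid) [error]" layout on a single line, and `parse_client_failure` and `failures_by_subvolume` are hypothetical helper names.

```python
import re
from collections import Counter

# Captures the subvolume name, path, gfid and error text of a client warning.
CLIENT_RE = re.compile(
    r"\] \d+-(?P<subvol>[\w-]+): remote operation failed\. "
    r"Path: (?P<path>.+?) \((?P<gfid>[0-9a-f-]{36})\) \[(?P<error>[^\]]+)\]"
)


def parse_client_failure(line):
    """Return a dict with subvol, path, gfid and error, or None if no match."""
    match = CLIENT_RE.search(line)
    return match.groupdict() if match else None


def failures_by_subvolume(lines):
    """Count 'remote operation failed' warnings per client subvolume."""
    counts = Counter()
    for line in lines:
        parsed = parse_client_failure(line)
        if parsed is not None:
            counts[parsed["subvol"]] += 1
    return counts


# The two client log lines quoted in this report, used as sample input.
SAMPLES = [
    "[2023-11-15 13:03:06.356267] W [MSGID: 114031] "
    "[client-rpc-fops_v2.c:2657:client4_0_lookup_cbk] 4-gv0-client-3: "
    "remote operation failed. Path: /2023/11 "
    "(9043f392-03f2-441b-bcf9-32ef7ecb43ad) [Permission denied]",
    "[2023-11-15 13:03:06.358871] W [MSGID: 114031] "
    "[client-rpc-fops_v2.c:2657:client4_0_lookup_cbk] 4-gv0-client-3: "
    "remote operation failed. Path: (null) "
    "(00000000-0000-0000-0000-000000000000) [Permission denied]",
]
```

A tally that shows a single subvolume would be consistent with the report that only one node logs these errors.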

Note: Please hide any confidential data which you don't want to share in public like IP address, file name, hostname or any other configuration