gluster / glusterfs

Gluster Filesystem : Build your distributed storage in minutes
https://www.gluster.org
GNU General Public License v2.0
4.69k stars 1.08k forks source link

dht_pt_getxattr does not seem to handle virtual xattrs. #1925

Closed itisravi closed 3 years ago

itisravi commented 3 years ago

@kotreshhr and I were debugging issues during upgrade testing of https://github.com/gluster/glusterfs/pull/1568. One of them is in DHT:

dht_pt_getxattr()which was added via commit b13b7855e5014e916fc64eb7a89c0ad3e8d70901 for perf improvements in pass-through mode (i.e. where distribute-count is 1) does not handle virtual xattrs. For geo-rep there is a virtual getxattr done by geo-rep on the fuse mount via DHT_SUBVOL_STATUS_KEY and it must not be sent over the network. dht_getxattr()handles it but its missing indht_pt_getxattr(). So the geo-rep session goes faulty.

After adding the below check, geo-rep was working again.

diff --git a/xlators/cluster/dht/src/dht-common.c b/xlators/cluster/dht/src/dht-common.c
index 8c326b96c..791221c0e 100644
--- a/xlators/cluster/dht/src/dht-common.c
+++ b/xlators/cluster/dht/src/dht-common.c
@@ -11337,6 +11337,12 @@ int
 dht_pt_getxattr(call_frame_t *frame, xlator_t *this, loc_t *loc,
                 const char *key, dict_t *xdata)
 {
+
+    if (strncmp(key, DHT_SUBVOL_STATUS_KEY, SLEN(DHT_SUBVOL_STATUS_KEY)) == 0) {
+        dht_vgetxattr_subvol_status(frame, this, key);
+        return 0;
+   }
+
     STACK_WIND(frame, dht_pt_getxattr_cbk, FIRST_CHILD(this),
                FIRST_CHILD(this)->fops->getxattr, loc, key, xdata);
     return 0;

@mohit84 What do you think? Interestingly, this does not happen when geo-rep is setup and started for the first time, but only when geo-rep session is stopped, glusterd is restarted and geo-rep session is started again.

mohit84 commented 3 years ago

@kotreshhr and I were debugging issues during upgrade testing of #1568. One of them is in DHT:

dht_pt_getxattr()which was added via commit b13b785 for perf improvements in pass-through mode (i.e. where distribute-count is 1) does not handle virtual xattrs. For geo-rep there is a virtual getxattr done by geo-rep on the fuse mount via DHT_SUBVOL_STATUS_KEY and it must not be sent over the network. dht_getxattr()handles it but its missing indht_pt_getxattr(). So the geo-rep session goes faulty.

After adding the below check, geo-rep was working again.

diff --git a/xlators/cluster/dht/src/dht-common.c b/xlators/cluster/dht/src/dht-common.c
index 8c326b96c..791221c0e 100644
--- a/xlators/cluster/dht/src/dht-common.c
+++ b/xlators/cluster/dht/src/dht-common.c
@@ -11337,6 +11337,12 @@ int
 dht_pt_getxattr(call_frame_t *frame, xlator_t *this, loc_t *loc,
                 const char *key, dict_t *xdata)
 {
+
+    if (strncmp(key, DHT_SUBVOL_STATUS_KEY, SLEN(DHT_SUBVOL_STATUS_KEY)) == 0) {
+        dht_vgetxattr_subvol_status(frame, this, key);
+        return 0;
+   }
+
     STACK_WIND(frame, dht_pt_getxattr_cbk, FIRST_CHILD(this),
                FIRST_CHILD(this)->fops->getxattr, loc, key, xdata);
     return 0;

@mohit84 What do you think? Interestingly, this does not happen when geo-rep is setup and started for the first time, but only when geo-rep session is stopped, glusterd is restarted and geo-rep session is started again.

Yes I full agree with you, the behavior should be similar to while dht_subvol is having more than 1 children.