Now the volumes are only reachable over this IP, which means that for local FUSE mounts I have to use the corresponding <node-ip> instead of localhost to connect to the volumes. Right?
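If so, a local mount would look something like this (the 192.168.1.x addresses are made up for this example; `backup-volfile-servers` just adds fallback volfile servers):

```shell
# Once glusterd only listens on the bind address, localhost no longer
# answers volfile requests, so the node IPs have to be used instead:
mount -t glusterfs \
      -o backup-volfile-servers=192.168.1.12:192.168.1.13 \
      192.168.1.11:/dev /mnt/dev

# instead of the previously working:
# mount -t glusterfs localhost:/dev /mnt/dev
```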
Now I try to create a geo-replication:
root@gluster1:~# gluster vol geo-replication dev storage7::rep create push-pem
Unable to mount and fetch primary volume details. Please check the log: /var/log/glusterfs/geo-replication/gverify-primarymnt.log
geo-replication command failed
Since that didn't work, I'll take a look at gverify-primarymnt.log:
[2023-11-07 09:18:36.045629 +0000] I [MSGID: 100030] [glusterfsd.c:2767:main] 0-glusterfs: Started running version [{arg=glusterfs}, {version=10.4}, {cmdlinestr=glusterfs -s localhost --xlator-option=*dht.lookup-unhashed=off --volfile-id dev -l /var/log/glusterfs/geo-replication/gverify-primarymnt.log /tmp/gverify.sh.keg6Z3}]
[2023-11-07 09:18:36.049175 +0000] I [glusterfsd.c:2447:daemonize] 0-glusterfs: Pid of current running process is 65262
[2023-11-07 09:18:36.061073 +0000] I [MSGID: 101190] [event-epoll.c:667:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=0}]
[2023-11-07 09:18:36.061230 +0000] I [MSGID: 101190] [event-epoll.c:667:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=1}]
[2023-11-07 09:18:36.061226 +0000] I [glusterfsd-mgmt.c:2673:mgmt_rpc_notify] 0-glusterfsd-mgmt: disconnected from remote-host: localhost
[2023-11-07 09:18:39.062354 +0000] I [glusterfsd-mgmt.c:2712:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2023-11-07 09:18:39.062913 +0000] W [glusterfsd.c:1458:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xfef5) [0x7ff83c21fef5] -->glusterfs(+0x153a3) [0x5612e992b3a3] -->glusterfs(cleanup_and_exit+0x58) [0x5612e991f658] ) 0-: received signum (1), shutting down
[2023-11-07 09:18:39.063490 +0000] I [fuse-bridge.c:7065:fini] 0-fuse: Unmounting '/tmp/gverify.sh.keg6Z3'.
[2023-11-07 09:18:39.064312 +0000] I [fuse-bridge.c:7069:fini] 0-fuse: Closing fuse connection to '/tmp/gverify.sh.keg6Z3'.
[2023-11-07 09:18:39.064508 +0000] W [glusterfsd.c:1458:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x8609) [0x7ff83c1d9609] -->glusterfs(glusterfs_sigwaiter+0xcd) [0x5612e991f7dd] -->glusterfs(cleanup_and_exit+0x58) [0x5612e991f658] ) 0-: received signum (15), shutting down
Two things jump right out at me:
{cmdlinestr=glusterfs -s localhost --xlator-option=*dht.lookup-unhashed=off --volfile-id dev -l /var/log/glusterfs/geo-replication/gverify-primarymnt.log /tmp/gverify.sh.keg6Z3}
and
[2023-11-07 09:18:36.061226 +0000] I [glusterfsd-mgmt.c:2673:mgmt_rpc_notify] 0-glusterfsd-mgmt: disconnected from remote-host: localhost
Am I wrong, or is it trying to establish a connection to the volume dev via localhost?
I can only think of two ways to change this:
The first way is to modify /etc/hosts:
#127.0.0.1 localhost
<node-ip> localhost
[...]
Or the second, somewhat more complex, way:
Modify /usr/libexec/glusterfs/gverify.sh:
[...]
function primary_stats()
{
    [...]
    if [ "$inet6" = "inet6" ]; then
        glusterfs -s localhost --xlator-option="*dht.lookup-unhashed=off" --xlator-option="transport.address-family=inet6" --volfile-id $PRIMARYVOL -l $primary_log_file $d;
    else
        # Modifications
        # Take the bind address from glusterd.vol; fall back to localhost
        # if none (or more than one) is configured.
        get_ip="$(grep -P '^[^#].+?((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}$' /etc/glusterfs/glusterd.vol | awk '{print $3}' | uniq)"
        volfile_server="localhost"
        if [ -n "$get_ip" ] && [ "$(echo "$get_ip" | wc -l)" -eq 1 ]; then
            volfile_server="$get_ip"
        fi
        glusterfs -s "$volfile_server" --xlator-option="*dht.lookup-unhashed=off" --volfile-id $PRIMARYVOL -l $primary_log_file $d;
        # Modifications END
        # glusterfs -s localhost --xlator-option="*dht.lookup-unhashed=off" --volfile-id $PRIMARYVOL -l $primary_log_file $d;
    fi
    [...]
}
[...]
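For what it's worth, the IP-extraction pipeline used in that modification can be checked in isolation against a throwaway file (the path and the 10.10.10.1 address are only examples):

```shell
# Build a sample glusterd.vol containing a bind address (made-up IP).
cat > /tmp/glusterd.vol.sample <<'EOF'
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport.socket.bind-address 10.10.10.1
    option transport.tcp.bind-address 10.10.10.1
end-volume
EOF

# Same pattern as in gverify.sh: keep non-comment lines ending in an
# IPv4 address, take the option value, collapse duplicates.
get_ip="$(grep -P '^[^#].+?((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}$' \
    /tmp/glusterd.vol.sample | awk '{print $3}' | uniq)"
echo "$get_ip"
```

Both bind-address lines carry the same value here, so `uniq` reduces them to a single candidate address.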
If I now try to create the geo-replication, it works:
root@gluster1:~# gluster vol geo-replication dev storage7::rep create push-pem
Creating geo-replication session between dev & storage7::rep has been successful
But when I start (or try to start) the geo-replication:
root@gluster1:/usr/libexec/glusterfs/python/syncdaemon# gluster volume geo-replication dev root@storage7::rep start
Starting geo-replication session between dev & storage7::rep has been successful
the command reports success, yet the status remains stuck at Created:
root@gluster1:~# gluster volume geo-replication dev root@storage7::rep status
PRIMARY NODE PRIMARY VOL PRIMARY BRICK SECONDARY USER SECONDARY SECONDARY NODE STATUS CRAWL STATUS LAST_SYNCED
-----------------------------------------------------------------------------------------------------------------------------------------------
storage1 dev /storage/brick/dev root storage7::rep N/A Created N/A N/A
storage3 dev /storage/brick/dev root storage7::rep N/A Created N/A N/A
storage2 dev /storage/brick/dev root storage7::rep N/A Created N/A N/A
If I now take a look at /var/log/glusterfs/geo-replication/dev_storage7_rep/gsyncd.log, I see the following:
[2023-11-08 13:04:18.74454] E [syncdutils(monitor):845:errlog] Popen: command returned error [{cmd=/usr/sbin/gluster --xml --remote-host=localhost volume info dev}, {error=1}]
It may have something to do with /usr/libexec/glusterfs/python/syncdaemon/subcmds.py.
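A quick way to confirm that suspicion is to run the monitor's command (taken from the log line above) by hand; with the bind address in place the localhost variant fails, while pointing it at the node's own address should presumably work (<node-ip> is a placeholder):

```shell
# This is what the monitor runs -- fails with error 1:
/usr/sbin/gluster --xml --remote-host=localhost volume info dev

# Presumably fine once pointed at the bound address:
/usr/sbin/gluster --xml --remote-host=<node-ip> volume info dev
```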
But at this point the question arises: is there really no other way to get geo-replication working with transport.socket.bind-address and transport.tcp.bind-address set than messing around in the code?
Operating system / glusterfs version: Ubuntu 20.04.5 LTS / glusterfs 10.4
Description of problem: I have the following setup: a volume called dev, which is replicated across gluster1, gluster2 and gluster3, and a volume called rep on gluster7. Everything is configured so that geo-replication should work without any problems. The problem starts when I add transport.socket.bind-address and transport.tcp.bind-address to /etc/glusterfs/glusterd.vol on each node.
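The per-node addition was presumably along these lines (<node-ip> stands for that node's own address; the surrounding management-volume block is the stock glusterd.vol content):

```
volume management
    type mgmt/glusterd
    [...]
    option transport.socket.bind-address <node-ip>
    option transport.tcp.bind-address <node-ip>
end-volume
```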