backube / volsync

Asynchronous data replication for Kubernetes volumes
https://volsync.readthedocs.io
GNU Affero General Public License v3.0

rsync failure: setgid failed #728

Open leipanhz opened 1 year ago

leipanhz commented 1 year ago

I have two K3s clusters on RHEL with Ceph and VolSync. Following the documentation, I set up a ReplicationSource and a ReplicationDestination. Rsync replication failed, and when I check the ReplicationSource I see the following error.

Logs from the source:

2023.04.26 21:26:04 LOG6[ui]: Initializing inetd mode configuration
2023.04.26 21:26:04 LOG7[ui]: Clients allowed=512000
2023.04.26 21:26:04 LOG5[ui]: stunnel 5.62 on x86_64-redhat-linux-gnu platform
2023.04.26 21:26:04 LOG5[ui]: Compiled/running with OpenSSL 3.0.1 14 Dec 2021
2023.04.26 21:26:04 LOG5[ui]: Threading:PTHREAD Sockets:POLL,IPv6 TLS:ENGINE,FIPS,OCSP,PSK,SNI
2023.04.26 21:26:04 LOG7[ui]: errno: (*__errno_location ())
2023.04.26 21:26:04 LOG6[ui]: Initializing inetd mode configuration
2023.04.26 21:26:04 LOG5[ui]: Reading configuration from file /tmp/stunnel-client.conf
2023.04.26 21:26:04 LOG5[ui]: UTF-8 byte order mark not detected
2023.04.26 21:26:04 LOG5[ui]: FIPS mode disabled
2023.04.26 21:26:04 LOG6[ui]: Compression enabled: 0 methods
2023.04.26 21:26:04 LOG7[ui]: No PRNG seeding was required
2023.04.26 21:26:04 LOG6[ui]: Initializing service [rsync]
2023.04.26 21:26:04 LOG6[ui]: PSKsecrets line 1: 64-byte hexadecimal key configured for identity "volsync"
2023.04.26 21:26:04 LOG6[ui]: PSK identities: 1 retrieved
2023.04.26 21:26:04 LOG6[ui]: Using the default TLS version as specified in OpenSSL crypto policies. Not setting explicitly.
2023.04.26 21:26:04 LOG6[ui]: Using the default TLS version as specified in OpenSSL crypto policies. Not setting explicitly
2023.04.26 21:26:04 LOG6[ui]: OpenSSL security level is used: 2
2023.04.26 21:26:04 LOG7[ui]: Ciphers: PSK
2023.04.26 21:26:04 LOG7[ui]: TLSv1.3 ciphersuites: TLS_AES_256_GCM_SHA384:TLS_AES_128_GCM_SHA256:TLS_CHACHA20_POLY1305_SHA256
2023.04.26 21:26:04 LOG7[ui]: TLS options: 0x2100000 (+0x0, -0x0)
2023.04.26 21:26:04 LOG6[ui]: Session resumption enabled
2023.04.26 21:26:04 LOG7[ui]: No certificate or private key specified
2023.04.26 21:26:04 LOG6[ui]: DH initialization skipped: client section
2023.04.26 21:26:04 LOG7[ui]: ECDH initialization
2023.04.26 21:26:04 LOG7[ui]: ECDH initialized with curves X25519:P-256:X448:P-521:P-384
2023.04.26 21:26:04 LOG5[ui]: Configuration successful
2023.04.26 21:26:04 LOG7[ui]: Deallocating deployed section defaults
2023.04.26 21:26:04 LOG7[ui]: Binding service [rsync]
2023.04.26 21:26:04 LOG7[ui]: Listening file descriptor created (FD=8)
2023.04.26 21:26:04 LOG7[ui]: Setting accept socket options (FD=8)
2023.04.26 21:26:04 LOG7[ui]: Option SO_REUSEADDR set on accept socket
2023.04.26 21:26:04 LOG6[ui]: Service [rsync] (FD=8) bound to 127.0.0.1:9000
2023.04.26 21:26:04 LOG7[main]: Created pid file /tmp/stunnel-client.pid
2023.04.26 21:26:04 LOG7[cron]: Cron thread initialized
2023.04.26 21:26:04 LOG6[cron]: Executing cron jobs
2023.04.26 21:26:04 LOG6[cron]: Cron jobs completed in 0 seconds
2023.04.26 21:26:04 LOG7[cron]: Waiting 86400 seconds
Syncing data to 9.46.108.39:8000 ...
2023.04.26 21:26:04 LOG7[main]: Found 1 ready file descriptor(s)
2023.04.26 21:26:04 LOG7[main]: FD=4 events=0x2001 revents=0x0
2023.04.26 21:26:04 LOG7[main]: FD=8 events=0x2001 revents=0x1
2023.04.26 21:26:04 LOG7[main]: Service [rsync] accepted (FD=3) from 127.0.0.1:41816
2023.04.26 21:26:04 LOG7[0]: Service [rsync] started
2023.04.26 21:26:04 LOG7[0]: Setting local socket options (FD=3)
2023.04.26 21:26:04 LOG7[0]: Option TCP_NODELAY set on local socket
2023.04.26 21:26:04 LOG5[0]: Service [rsync] accepted connection from 127.0.0.1:41816
2023.04.26 21:26:04 LOG6[0]: s_connect: connecting 9.46.108.39:8000
2023.04.26 21:26:04 LOG7[0]: s_connect: s_poll_wait 9.46.108.39:8000: waiting 10 seconds
2023.04.26 21:26:04 LOG7[0]: FD=6 events=0x2001 revents=0x0
2023.04.26 21:26:04 LOG7[0]: FD=10 events=0x2005 revents=0x0
2023.04.26 21:26:04 LOG5[0]: s_connect: connected 9.46.108.39:8000
2023.04.26 21:26:04 LOG5[0]: Service [rsync] connected remote server from 10.42.2.88:37512
2023.04.26 21:26:04 LOG7[0]: Setting remote socket options (FD=10)
2023.04.26 21:26:04 LOG7[0]: Option TCP_NODELAY set on remote socket
2023.04.26 21:26:04 LOG7[0]: Remote descriptor (FD=10) initialized
2023.04.26 21:26:04 LOG6[0]: SNI: sending servername: 9.46.108.39
2023.04.26 21:26:04 LOG6[0]: Peer certificate not required
2023.04.26 21:26:04 LOG7[0]: TLS state (connect): before SSL initialization
2023.04.26 21:26:04 LOG7[0]: Initializing application specific data for session authenticated
2023.04.26 21:26:04 LOG6[0]: PSK client configured for identity "volsync"
2023.04.26 21:26:04 LOG7[0]: Initializing application specific data for session authenticated
2023.04.26 21:26:04 LOG7[0]: TLS state (connect): SSLv3/TLS write client hello
2023.04.26 21:26:04 LOG7[0]: TLS state (connect): SSLv3/TLS write client hello
2023.04.26 21:26:04 LOG7[0]: Deallocating application specific data for session connect address
2023.04.26 21:26:04 LOG7[0]: Initializing application specific data for session authenticated
2023.04.26 21:26:04 LOG7[0]: Deallocating application specific data for session connect address
2023.04.26 21:26:04 LOG7[0]: TLS state (connect): SSLv3/TLS read server hello
2023.04.26 21:26:04 LOG7[0]: TLS state (connect): TLSv1.3 read encrypted extensions
2023.04.26 21:26:04 LOG7[0]: TLS state (connect): SSLv3/TLS read finished
2023.04.26 21:26:04 LOG7[0]: TLS state (connect): SSLv3/TLS write change cipher spec
2023.04.26 21:26:04 LOG7[0]: TLS state (connect): SSLv3/TLS write finished
2023.04.26 21:26:04 LOG7[0]:      1 client connect(s) requested
2023.04.26 21:26:04 LOG7[0]:      1 client connect(s) succeeded
2023.04.26 21:26:04 LOG7[0]:      0 client renegotiation(s) requested
2023.04.26 21:26:04 LOG7[0]:      1 session reuse(s)
2023.04.26 21:26:04 LOG6[0]: TLS connected: previous session reused
2023.04.26 21:26:04 LOG6[0]: TLSv1.3 ciphersuite: TLS_AES_128_GCM_SHA256 (128-bit encryption)
2023.04.26 21:26:04 LOG6[0]: Peer temporary key: X25519, 253 bits
2023.04.26 21:26:04 LOG7[0]: Compression: null, expansion: null
2023.04.26 21:26:04 LOG6[0]: Session id:
2023.04.26 21:26:04 LOG7[0]: TLS state (connect): SSL negotiation finished successfully
2023.04.26 21:26:04 LOG7[0]: TLS state (connect): SSL negotiation finished successfully
2023.04.26 21:26:04 LOG7[0]: Initializing application specific data for session authenticated
2023.04.26 21:26:04 LOG7[0]: Deallocating application specific data for session connect address
2023.04.26 21:26:04 LOG7[0]: New session callback
2023.04.26 21:26:04 LOG6[0]: No peer certificate received
2023.04.26 21:26:04 LOG6[0]: Session id: 02C3DE98488C7E8880B2BC48048CA7E3735ED410F5D5603ABD6035111288293B
2023.04.26 21:26:04 LOG7[0]: TLS state (connect): SSLv3/TLS read server session ticket
@ERROR: setgid failed
rsync error: error starting client-server protocol (code 5) at main.c(1821) [sender=3.2.3]
2023.04.26 21:26:04 LOG6[0]: Read socket closed (readsocket)
2023.04.26 21:26:04 LOG7[0]: Sending close_notify alert

My configuration for destination

apiVersion: volsync.backube/v1alpha1
kind: ReplicationDestination
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"volsync.backube/v1alpha1","kind":"ReplicationDestination","metadata":{"annotations":{},"name":"dst","namespace":"volsync-system"},"spec":{"rsyncTLS":{"accessModes":["ReadWriteOnce"],"capacity":"10Gi","copyMethod":"Snapshot","serviceType":"LoadBalancer","storageClassName":"rook-ceph-block","volumeSnapshotClassName":"csi-rbdplugin-mysql-snapclass"}}}
  creationTimestamp: "2023-04-26T00:31:47Z"
  generation: 1
  name: dst
  namespace: volsync-system
  resourceVersion: "466470"
  uid: 1305d7ba-6d09-4cc2-b589-9b2e3f054a9d
spec:
  rsyncTLS:
    accessModes:
    - ReadWriteOnce
    capacity: 10Gi
    copyMethod: Snapshot
    serviceType: LoadBalancer
    storageClassName: rook-ceph-block
    volumeSnapshotClassName: csi-rbdplugin-mysql-snapclass
status:
  conditions:
  - lastTransitionTime: "2023-04-26T00:31:47Z"
    message: Synchronization in-progress
    reason: SyncInProgress
    status: "True"
    type: Synchronizing
  lastSyncStartTime: "2023-04-26T00:31:47Z"
  latestMoverStatus: {}
  rsyncTLS:
    address: xxx
    keySecret: volsync-rsync-tls-dst

On destination node

% kc get svc
NAME                        TYPE           CLUSTER-IP      EXTERNAL-IP                                       PORT(S)          AGE
volsync-metrics             ClusterIP      10.43.194.190   <none>                                            8443/TCP         21h
volsync-rsync-tls-dst-dst   LoadBalancer   10.43.85.20     xxxx  8000:30461/TCP   15h

% kc get pod
NAME                              READY   STATUS    RESTARTS   AGE
volsync-669c76cf4c-2745s          2/2     Running   0          21h
volsync-rsync-tls-dst-dst-2c2gc   1/1     Running   0          15h

The error is related to authentication

tesshuflower commented 1 year ago

Is your PVC a block volume? VolSync currently only supports PVCs with VolumeMode of Filesystem.

VolSync with the rsync-TLS mover has been tested with cephfs but it looks like you are using rook-ceph-block.

leipanhz commented 1 year ago

> Is your PVC a block volume? VolSync currently only supports PVCs with VolumeMode of Filesystem.
>
> VolSync with the rsync-TLS mover has been tested with cephfs but it looks like you are using rook-ceph-block.

I see, my PVC is a block volume. Does volsync support Object Storage, or FileSystem only?

JohnStrunk commented 1 year ago

Just to be clear, RWO (rbd) volumes are supported as long as they are volumeMode: Filesystem, not volumeMode: Block.
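
As a rough illustration of that distinction, a PVC on an RBD-backed storage class that requests a filesystem volume might look like the minimal sketch below. The PVC name `data-pvc` is hypothetical. The storage class, access mode, and size are copied from the ReplicationDestination spec earlier in this thread and may not match your source PVC.

```yaml
# Minimal sketch, not taken from the VolSync docs.
# "data-pvc" is a hypothetical name; adjust the values to your cluster.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
    - ReadWriteOnce
  # Filesystem is the Kubernetes default when volumeMode is omitted.
  # A PVC created with volumeMode: Block exposes a raw device instead.
  volumeMode: Filesystem
  storageClassName: rook-ceph-block
  resources:
    requests:
      storage: 10Gi
```

To see which mode an existing PVC uses, look at the `spec.volumeMode` field in its manifest.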

The above error looks like you ran into this: https://volsync.readthedocs.io/en/latest/usage/rsync-tls/index.html#rsync-tls-mover-permissions.

You'd need to do one of: