abbbi / virtnbdbackup

Backup utility for Libvirt / qemu / kvm supporting incremental and differential backups + instant recovery (agentless).
http://libvirtbackup.grinser.de/
GNU General Public License v3.0
330 stars 46 forks source link

connecting to remote ssh fails: [Unknown exception occured: [Errno 16] Device or resource busy] #92

Closed imbmf closed 1 year ago

imbmf commented 1 year ago

Version used 1.9.10

Describe the bug Greetings I am getting this error [Unknown exception occured: [Errno 16] Device or resource busy] when triyng to perform remote backup over ssh. Passwordless ssh is set up on both sides

Expected behavior backup to finish

Hypervisor information:

Logfiles: Connection to 10.10.100.7 closed. root@mini-nas:~# virtnbdbackup -U qemu+ssh://root@10.10.100.7/system -d debian11 -l full -o /ssd-raid/backups/ [2023-01-04 07:34:28] INFO common - printVersion [MainThread]: Version: 1.9.10 Arguments: /usr/bin/virtnbdbackup -U qemu+ssh://root@10.10.100.7/system -d debian11 -l full -o /ssd-raid/backups/ [2023-01-04 07:34:28] INFO virtnbdbackup - main [MainThread]: Backup level: [full] [2023-01-04 07:34:29] INFO client - _connect [MainThread]: Connected to remote host: [hv-02.kokal.local], local host: [mini-nas] [2023-01-04 07:34:29] INFO virtnbdbackup - main [MainThread]: Libvirt library version: [8010000] [2023-01-04 07:34:29] INFO virtnbdbackup - main [MainThread]: Domain has [1] disks attached which support changed block tracking. [2023-01-04 07:34:29] INFO virtnbdbackup - main [MainThread]: Concurrent backup processes: [1] [2023-01-04 07:34:29] INFO client - redefineCheckpoints [MainThread]: Loading checkpoint list from: [/ssd-raid/backups//checkpoints] [2023-01-04 07:34:29] INFO virtnbdbackup - handleCheckpoints [MainThread]: Checkpoint handling. [2023-01-04 07:34:29] INFO virtnbdbackup - handleCheckpoints [MainThread]: Using checkpoint name: [virtnbdbackup.0]. [2023-01-04 07:34:29] INFO client - connect [MainThread]: Connecting remote system via ssh, username: [root] [2023-01-04 07:34:29] WARNING common - sshSession [MainThread]: Failed to setup SSH connection: [Unknown exception occured: [Errno 16] Device or resource busy] [2023-01-04 07:34:29] ERROR virtnbdbackup - main [MainThread]: Remote backup detected but ssh session setup failed

Workaround: none

abbbi commented 1 year ago

never seen this, looks like an issue with your underlying libvirt stack on the source system.. Try using the provided URI (qemu+ssh://root@10.10.100.7/system) with virsh. It will probably result in the same problem:

What happens if executing following command on mini-nas:

virsh -c qemu+ssh://root@10.10.100.7/system

imbmf commented 1 year ago

It was not clear for me that i need to have virsh installed on both systems after installation on mini-nas (libvirtd (libvirt) 8.10.0) and executing command i get this otput: root@mini-nas:~# virsh -c qemu+ssh://root@10.10.100.7/system Welcome to virsh, the virtualization interactive terminal.

Type: 'help' for help with commands 'quit' to quit but still no luck with backup same error message.

abbbi commented 1 year ago

can you also try to execute virtnbdbackup with "-v" option? It should add some more debug information to the ssh session setup and make paramiko show some more verbose output.

imbmf commented 1 year ago

root@mini-nas:~# python3 Python 3.10.9 (main, Dec 7 2022, 13:47:07) [GCC 12.2.0] on linux Type "help", "copyright", "credits" or "license" for more information.

import libvirt libvirt.open('qemu+ssh://root@10.10.100.7/system') <libvirt.virConnect object at 0xb67b0ee0> Output with -v root@mini-nas:~# virtnbdbackup -v -U qemu+ssh://root@10.10.100.7/system -d debian11 -l full -o /ssd-raid/backups/ [2023-01-04 08:37:24] INFO common - printVersion [MainThread]: Version: 1.9.10 Arguments: /usr/bin/virtnbdbackup -v -U qemu+ssh://root@10.10.100.7/system -d debian11 -l full -o /ssd-raid/backups/ [2023-01-04 08:37:24] INFO virtnbdbackup - main [MainThread]: Backup level: [full] [2023-01-04 08:37:24] DEBUG client - _connect [MainThread]: Libvirt URI: [qemu+ssh://root@10.10.100.7/system] [2023-01-04 08:37:24] DEBUG client - _connect [MainThread]: Login information specified, connect libvirtd using openAuth function. [2023-01-04 08:37:25] INFO client - _connect [MainThread]: Connected to remote host: [hv-02.kokal.local], local host: [mini-nas] [2023-01-04 08:37:25] INFO virtnbdbackup - main [MainThread]: Libvirt library version: [8010000] [2023-01-04 08:37:25] DEBUG client - getDomainDisks [MainThread]: Disk config file notation [2023-01-04 08:37:25] DEBUG client - getDomainInfo [MainThread]: No setting [loader] found: 'NoneType' object has no attribute 'text' [2023-01-04 08:37:25] DEBUG client - getDomainInfo [MainThread]: No setting [nvram] found: 'NoneType' object has no attribute 'text' [2023-01-04 08:37:25] DEBUG client - getDomainInfo [MainThread]: No setting [kernel] found: 'NoneType' object has no attribute 'text' [2023-01-04 08:37:25] DEBUG client - getDomainInfo [MainThread]: No setting [initrd] found: 'NoneType' object has no attribute 'text' [2023-01-04 08:37:25] DEBUG client - getDomainInfo [MainThread]: Domain Info: [{}] [2023-01-04 08:37:25] INFO virtnbdbackup - main [MainThread]: Domain has [1] disks attached which support changed block tracking. [2023-01-04 08:37:25] INFO virtnbdbackup - main [MainThread]: Concurrent backup processes: [1] [2023-01-04 08:37:25] INFO client - redefineCheckpoints [MainThread]: Loading checkpoint list from: [/ssd-raid/backups//checkpoints] [2023-01-04 08:37:25] INFO virtnbdbackup - handleCheckpoints [MainThread]: Checkpoint handling. [2023-01-04 08:37:25] DEBUG client - removeAllCheckpoints [MainThread]: Cleaning up persistent storage /ssd-raid/backups//checkpoints [2023-01-04 08:37:25] INFO virtnbdbackup - handleCheckpoints [MainThread]: Using checkpoint name: [virtnbdbackup.0]. [2023-01-04 08:37:25] DEBUG virtnbdbackup - handleCheckpoints [MainThread]: Checkpoint info: [{'name': 'virtnbdbackup.0', 'parent': '', 'file': '/ssd-raid/backups//debian11.cpt'}]. [2023-01-04 08:37:25] INFO client - connect [MainThread]: Connecting remote system via ssh, username: [root] [2023-01-04 08:37:25] WARNING common - sshSession [MainThread]: Failed to setup SSH connection: [Unknown exception occured: [Errno 16] Device or resource busy] [2023-01-04 08:37:25] ERROR virtnbdbackup - main [MainThread]: Remote backup detected but ssh session setup failed

abbbi commented 1 year ago

thanks, apparently it bails outhere:

https://github.com/abbbi/virtnbdbackup/blob/master/libvirtnbdbackup/ssh/client.py#L72

so the paramiko plugin on that system has an issue establishing the connection. Maybe try to add an traceback here or try to create a minimal program using paramiko to connect the remote system to debug the issue. I suspect it might be because of different ssh versions (maybe non supported ciphers on the remote system or something like that?)

It may also be that on the system you are executing, the paramiko module is disfunctional. It looks like it is not even attempting to open the ssh connection and fails very early in the process, otherwise you would get some clean debug output of the paramiko module what it attempts to connect.

abbbi commented 1 year ago

try something like this:

python3
Python 3.9.2 (default, Feb 28 2021, 17:03:44) 
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
from paramiko import AutoAddPolicy, SSHClient
cli = SSHClient()
cli.load_system_host_keys()
cli.set_missing_host_key_policy(AutoAddPolicy())
cli.connect("10.10.100.7", username="root", timeout=5000)
imbmf commented 1 year ago

this is output: root@mini-nas:~# python3 Python 3.10.9 (main, Dec 7 2022, 13:47:07) [GCC 12.2.0] on linux Type "help", "copyright", "credits" or "license" for more information.

from paramiko import AutoAddPolicy, SSHClient cli = SSHClient() cli.load_system_host_keys() cli.set_missing_host_key_policy(AutoAddPolicy()) cli.connect("10.10.100.7", username="root", timeout=5000)

abbbi commented 1 year ago

no error whatsoever? did it even execute or did you also copy and paste ">>>"?

imbmf commented 1 year ago

mini-nas

abbbi commented 1 year ago

okey, this becomes really strange now. Same code is done in virtnbdbackup and fails. Which interpreter is virtnbdbackup using on the system?

head -1 /usr/bin/virtnbdbackup
which python3

maybe its another interpreter. Which linux distribution used? Anything logged in the dmesg or syslog on the system? Maybe Apparmor or SELinux messing with things? Im really out of ideas now, sorry.

imbmf commented 1 year ago

root@mini-nas:~# head -1 /usr/bin/virtnbdbackup

!/usr/bin/python3

root@mini-nas:~# which python3 /usr/bin/python3

My mini-nas is 32 bit system maybe this is the issue?

abbbi commented 1 year ago

may be, not using 32 bit systems at all. Maybe the issue occurs if both libvirt module and paramiko module are loaded in the same python session (would explain "device or resource busy", maybe some file handle in use):

try:

/usr/bin/python3
import libvirt
from paramiko import AutoAddPolicy, SSHClient
libvirt.open('qemu+ssh://root@10.10.100.7/system')
cli = SSHClient()
cli.load_system_host_keys()
cli.set_missing_host_key_policy(AutoAddPolicy())
cli.connect("10.10.100.7", username="root", timeout=5000)

otherwise try to remove the except clause here:

https://github.com/abbbi/virtnbdbackup/blob/master/libvirtnbdbackup/ssh/client.py#L72-L73

to make it print a full traceback.

abbbi commented 1 year ago

you may check out this branch: https://github.com/abbbi/virtnbdbackup/tree/issue92 Adds some log statement to provide full traceback of exception for easier analysis:

https://github.com/abbbi/virtnbdbackup/commit/956bf3c1e83f56ec1c4dd8fa54247bba19f15f16

imbmf commented 1 year ago

you may check out this branch: https://github.com/abbbi/virtnbdbackup/tree/issue92 Adds some log statement to provide full traceback of exception for easier analysis:

956bf3c

This is beyond my skills. :)

abbbi commented 1 year ago

git clone https://github.com/abbbi/virtnbdbackup -b issue92 cd virtnbdbackup/ ; ./virtnbdbackup ..

imbmf commented 1 year ago

root@mini-nas:~/virtnbdbackup# ./virtnbdbackup -v -U qemu+ssh://root@10.10.100.7/system -d debian11 -l full -o /ssd-raid/backups/ [2023-01-04 11:31:13] INFO common - printVersion [MainThread]: Version: 1.9.11 Arguments: ./virtnbdbackup -v -U qemu+ssh://root@10.10.100.7/system -d debian11 -l full -o /ssd-raid/backups/ [2023-01-04 11:31:13] INFO virtnbdbackup - main [MainThread]: Backup level: [full] [2023-01-04 11:31:13] DEBUG client - _connect [MainThread]: Libvirt URI: [qemu+ssh://root@10.10.100.7/system] [2023-01-04 11:31:13] DEBUG client - _connect [MainThread]: Login information specified, connect libvirtd using openAuth function. [2023-01-04 11:31:14] INFO client - _connect [MainThread]: Connected to remote host: [hv-02.kokal.local], local host: [mini-nas] [2023-01-04 11:31:14] INFO virtnbdbackup - main [MainThread]: Libvirt library version: [8010000] [2023-01-04 11:31:14] DEBUG client - getDomainDisks [MainThread]: Disk config file notation [2023-01-04 11:31:14] DEBUG client - getDomainInfo [MainThread]: No setting [loader] found: 'NoneType' object has no attribute 'text' [2023-01-04 11:31:14] DEBUG client - getDomainInfo [MainThread]: No setting [nvram] found: 'NoneType' object has no attribute 'text' [2023-01-04 11:31:14] DEBUG client - getDomainInfo [MainThread]: No setting [kernel] found: 'NoneType' object has no attribute 'text' [2023-01-04 11:31:14] DEBUG client - getDomainInfo [MainThread]: No setting [initrd] found: 'NoneType' object has no attribute 'text' [2023-01-04 11:31:14] DEBUG client - getDomainInfo [MainThread]: Domain Info: [{}] [2023-01-04 11:31:14] INFO virtnbdbackup - main [MainThread]: Domain has [1] disks attached which support changed block tracking. [2023-01-04 11:31:14] INFO virtnbdbackup - main [MainThread]: Concurrent backup processes: [1] [2023-01-04 11:31:14] INFO client - redefineCheckpoints [MainThread]: Loading checkpoint list from: [/ssd-raid/backups//checkpoints] [2023-01-04 11:31:14] INFO virtnbdbackup - handleCheckpoints [MainThread]: Checkpoint handling. [2023-01-04 11:31:14] DEBUG client - removeAllCheckpoints [MainThread]: Cleaning up persistent storage /ssd-raid/backups//checkpoints [2023-01-04 11:31:14] INFO virtnbdbackup - handleCheckpoints [MainThread]: Using checkpoint name: [virtnbdbackup.0]. [2023-01-04 11:31:14] DEBUG virtnbdbackup - handleCheckpoints [MainThread]: Checkpoint info: [{'name': 'virtnbdbackup.0', 'parent': '', 'file': '/ssd-raid/backups//debian11.cpt'}]. [2023-01-04 11:31:14] INFO client - connect [MainThread]: Connecting remote system via ssh, username: [root] [2023-01-04 11:31:14] ERROR client - connect [MainThread]: [Errno 16] Device or resource busy Traceback (most recent call last): File "/root/virtnbdbackup/libvirtnbdbackup/ssh/client.py", line 62, in connect cli.connect( File "/usr/lib/python3/dist-packages/paramiko/client.py", line 349, in connect to_try = list(self._families_and_addresses(hostname, port)) File "/usr/lib/python3/dist-packages/paramiko/client.py", line 203, in _families_and_addresses addrinfos = socket.getaddrinfo( File "/usr/lib/python3.10/socket.py", line 955, in getaddrinfo for res in _socket.getaddrinfo(host, port, family, type, proto, flags): OSError: [Errno 16] Device or resource busy [2023-01-04 11:31:14] WARNING common - sshSession [MainThread]: Failed to setup SSH connection: [Unknown exception occured: [Errno 16] Device or resource busy] [2023-01-04 11:31:14] ERROR virtnbdbackup - main [MainThread]: Remote backup detected but ssh session setup failed

abbbi commented 1 year ago

wow, it fails in getaddrinfo:

cli.connect(
File "/usr/lib/python3/dist-packages/paramiko/client.py", line 349, in connect
to_try = list(self._families_and_addresses(hostname, port))
File "/usr/lib/python3/dist-packages/paramiko/client.py", line 203, in _families_and_addresses
addrinfos = socket.getaddrinfo(
File "/usr/lib/python3.10/socket.py", line 955, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
OSError: [Errno 16] Device or resource busy

this is not an issue in virtnbdbackup i could solve, its an problem with your underlying python/libc combination probably. Maybe cause is also that its a 32 bit python/system combination. you would have to look at the internet. Similar issues ive found:

https://github.com/matrix-org/synapse/issues/6190

abbbi commented 1 year ago

maybe its:

https://github.com/matrix-org/synapse/issues/6190#issuecomment-724912047

try "git pull" and execute again, it now logs the IP address too to see if there is any trailing whitespace or alike that causes the issue like suggested in this comment.

imbmf commented 1 year ago

root@mini-nas:~/virtnbdbackup# git pull remote: Enumerating objects: 5, done. remote: Counting objects: 100% (5/5), done. remote: Compressing objects: 100% (2/2), done. remote: Total 5 (delta 3), reused 5 (delta 3), pack-reused 0 Unpacking objects: 100% (5/5), 511 bytes | 28.00 KiB/s, done. From https://github.com/abbbi/virtnbdbackup 956bf3c..f093e4b issue92 -> origin/issue92 Updating 956bf3c..f093e4b Fast-forward libvirtnbdbackup/ssh/client.py | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) root@mini-nas:~/virtnbdbackup# ./virtnbdbackup -v -U qemu+ssh://root@10.10.100.7/system -d debian11 -l full -o /ssd-raid/backups/ [2023-01-04 11:41:59] INFO common - printVersion [MainThread]: Version: 1.9.11 Arguments: ./virtnbdbackup -v -U qemu+ssh://root@10.10.100.7/system -d debian11 -l full -o /ssd-raid/backups/ [2023-01-04 11:42:00] INFO virtnbdbackup - main [MainThread]: Backup level: [full] [2023-01-04 11:42:00] DEBUG client - _connect [MainThread]: Libvirt URI: [qemu+ssh://root@10.10.100.7/system] [2023-01-04 11:42:00] DEBUG client - _connect [MainThread]: Login information specified, connect libvirtd using openAuth function. [2023-01-04 11:42:00] INFO client - _connect [MainThread]: Connected to remote host: [hv-02.kokal.local], local host: [mini-nas] [2023-01-04 11:42:00] INFO virtnbdbackup - main [MainThread]: Libvirt library version: [8010000] [2023-01-04 11:42:00] DEBUG client - getDomainDisks [MainThread]: Disk config file notation [2023-01-04 11:42:00] DEBUG client - getDomainInfo [MainThread]: No setting [loader] found: 'NoneType' object has no attribute 'text' [2023-01-04 11:42:00] DEBUG client - getDomainInfo [MainThread]: No setting [nvram] found: 'NoneType' object has no attribute 'text' [2023-01-04 11:42:00] DEBUG client - getDomainInfo [MainThread]: No setting [kernel] found: 'NoneType' object has no attribute 'text' [2023-01-04 11:42:00] DEBUG client - getDomainInfo [MainThread]: No setting [initrd] found: 'NoneType' object has no attribute 'text' [2023-01-04 11:42:00] DEBUG client - getDomainInfo [MainThread]: Domain Info: [{}] [2023-01-04 11:42:00] INFO virtnbdbackup - main [MainThread]: Domain has [1] disks attached which support changed block tracking. [2023-01-04 11:42:00] INFO virtnbdbackup - main [MainThread]: Concurrent backup processes: [1] [2023-01-04 11:42:00] INFO client - redefineCheckpoints [MainThread]: Loading checkpoint list from: [/ssd-raid/backups//checkpoints] [2023-01-04 11:42:00] INFO virtnbdbackup - handleCheckpoints [MainThread]: Checkpoint handling. [2023-01-04 11:42:00] DEBUG client - removeAllCheckpoints [MainThread]: Cleaning up persistent storage /ssd-raid/backups//checkpoints [2023-01-04 11:42:00] INFO virtnbdbackup - handleCheckpoints [MainThread]: Using checkpoint name: [virtnbdbackup.0]. [2023-01-04 11:42:00] DEBUG virtnbdbackup - handleCheckpoints [MainThread]: Checkpoint info: [{'name': 'virtnbdbackup.0', 'parent': '', 'file': '/ssd-raid/backups//debian11.cpt'}]. [2023-01-04 11:42:00] INFO client - connect [MainThread]: Connecting remote system [hv-02.kokal.local] via ssh, username: [root] [2023-01-04 11:42:00] ERROR client - connect [MainThread]: [Errno 16] Device or resource busy Traceback (most recent call last): File "/root/virtnbdbackup/libvirtnbdbackup/ssh/client.py", line 66, in connect cli.connect( File "/usr/lib/python3/dist-packages/paramiko/client.py", line 349, in connect to_try = list(self._families_and_addresses(hostname, port)) File "/usr/lib/python3/dist-packages/paramiko/client.py", line 203, in _families_and_addresses addrinfos = socket.getaddrinfo( File "/usr/lib/python3.10/socket.py", line 955, in getaddrinfo for res in _socket.getaddrinfo(host, port, family, type, proto, flags): OSError: [Errno 16] Device or resource busy [2023-01-04 11:42:00] WARNING common - sshSession [MainThread]: Failed to setup SSH connection: [Unknown exception occured: [Errno 16] Device or resource busy] [2023-01-04 11:42:00] ERROR virtnbdbackup - main [MainThread]: Remote backup detected but ssh session setup failed

abbbi commented 1 year ago

looks ok to me, nothing special here:

Connecting remote system [hv-02.kokal.local]

is your DNS setup on the system OK? Can you reverse resolve the hostname? Do you use hosts file only or is there an NSCD on the system? Is the system standard libc or musl or something else?

imbmf commented 1 year ago

You are right it was a DNS issue (shame on me). and after adding ndb service in firewall it looks like everything is working!