Open horazont opened 5 years ago
Are there other tools which take FDs as an option and how is their cli option named?
I could’ve sworn I saw that, but I haven’t been able to find any example. Tools seem to prefer environment variables for that for some reason (which would be fine by me).
- NOTIFY_SOCKET: environment variable used by systemd for Type=notify services to call back when they’re done starting up.
- XSS_SLEEP_LOCK_FD is used by xss-lock as a signalling channel when locking the screen has completed.

I think it might make more sense to allow passing a UNIX socket instead of a pair of file descriptors, maybe?
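To illustrate the environment-variable convention these tools use, here is a minimal Python sketch in which a parent hands one end of a socket pair to a child process and announces the descriptor number through a variable. The name EXAMPLE_RPC_FD and the helper talk_to_child are hypothetical (borg defines no such variable), and the sketch assumes a POSIX platform.

```python
import os
import socket
import subprocess
import sys

# Hypothetical variable name, chosen only for this illustration.
FD_ENV_VAR = "EXAMPLE_RPC_FD"

# Child program: looks up the inherited descriptor and echoes one message back.
CHILD_CODE = f"""
import os, socket
fd = int(os.environ[{FD_ENV_VAR!r}])
with socket.socket(fileno=fd) as conn:
    conn.sendall(b"echo:" + conn.recv(1024))
"""


def talk_to_child(message: bytes) -> bytes:
    """Spawn a child that inherits one socket end and exchange one message."""
    parent_end, child_end = socket.socketpair()
    with parent_end:
        with child_end:
            env = dict(os.environ)
            env[FD_ENV_VAR] = str(child_end.fileno())
            # pass_fds keeps the descriptor open under the same number in the child.
            proc = subprocess.Popen(
                [sys.executable, "-c", CHILD_CODE],
                env=env,
                pass_fds=[child_end.fileno()],
            )
        # The parent's copy of child_end is closed now; only the child holds it.
        parent_end.sendall(message)
        reply = parent_end.recv(1024)
    proc.wait()
    return reply
```

A single bidirectional socket is used here instead of a pair of pipe descriptors, which matches the suggestion that one UNIX socket may be simpler to pass around than two FDs.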
Guess it wouldn't be too hard to implement that, but it also needs to be tested, so how can one test this?
I have looked a slight bit into the code, and I saw that for testing, you fall back to invoking python with -m borg.archiver instead of invoking borg serve remotely.
In a test, you could let the FD option take precedence over that and spawn a borg serve e.g. using socat EXEC on the side. Although the downside is that the process lifecycle of the borg serve needs to be managed by the test tool. (And, of course, it is not possible to pass options to the borg serve process using this mechanism, which needs to be documented properly.)
Connecting to a UNIX socket file or connecting to a localhost TCP port would be nicer than file descriptors for most use cases I can think of. Socket files are a bit annoying in that they need to be unlinked before usage etc and ssh is not reliably unlinking socket files (at least for me). But there could be the option of doing both.
Syntax could look something like this:
borg create borg:///run/borg/borg-serve.socket:/path/to/repo::/archive_name
borg create borg://localhost:1030:/path/to/repo::/archive_name
The reasoning behind calling the protocol borg is that the socket would be connected to a borg serve process.
To complement this, borg serve could have the option --listen (or something similar):
borg serve --listen /run/borg/borg-serve.socket
borg serve --listen localhost:1030
This behavior would make pull-style backups with borg serve running via an ssh reverse tunnel as easy as this. Example to back up host potato:
borg serve --listen /run/borg/potato.socket &
ssh -o "StreamLocalBindUnlink yes" \
-R /run/borg/borg-serve.socket:/run/borg/potato.socket potato \
borg create borg:///run/borg/borg-serve.socket:/backups/potato::new folder1 folder2
or with TCP sockets:
borg serve --listen 127.0.0.1:5060 &
ssh -R 1030:127.0.0.1:5060 potato \
borg create borg://127.0.0.1:1030:/backups/potato::new folder1 folder2
One could even create an alias/wrapper to all of this. I don't know whether this would still be in the scope of borg:
borg pull potato /backups/potato::new folder1 folder2
could start borg serve on a temporary socket/port, start ssh remote port forwarding on a temporary port, call borg create on the remote host with the appropriate options, wait for the backup to complete, and stop borg serve again. This might look out of scope, but borg can already back up to remote ssh repositories and call borg serve there, which means it already has the capability to call ssh. This would in theory just be the "reverse way" of doing borg create. IMHO this would simplify things a lot.
@felinira that sounds pretty interesting, but what I don't like about it is the complex syntax for the REPO_ARCHIVE argument. Users would need to cope with that and our code also.
I mean especially the socket file variant (the tcp variant is I guess easier to get right):
borg:///run/borg/borg-serve.socket:/path/to/repo::/archive_name
IMHO it was a mistake to ever have repo and archive name mixed together in one argument (see the complex / error prone parsing code we have for that).
Your syntax suggestion would make this even way more complex than it already is and there would be quite some probability of introducing new parsing issues (and new usability issues on the user side).
Yeah, I don't like it either. Of course you could add something like --connect but that would make it inconsistent with ssh://. One could of course add a third alternate syntax to ssh to break it all up.
borg create --connect ssh://backupsrv /path/to/repo/on/backupsrv::archivename
borg create --connect borg://localhost:5060 /path/to/repo/on/backupsrv::archivename
I don't really know if this is better.
Maybe this is something to postpone for an eventual 2.0 version where #948 is also addressed? I can think of two variants of how to support this in that case.

Variant 1: In that case, there could be a --repository argument which supports URIs:
- file:///path/to/local/repository (with optional file://)
- ssh://user@server:22/path/to/remote/repository like current ssh://
- borg:///path/to/socket for UNIX sockets
- borg://server:port for TCP sockets

What obviously is missing is that there’s no way to select the path to the repository for the borg:// protocol. I think in the pull-style scenarios, you’ll want to predetermine the repository on the puller side of things anyways (using --restrict-to-repository). If the borg serve process could be asked which repository it exports, the borg client (running on the pullee) could simply ask the borg server for the path instead of having it set in the URL.
If one wants to support multiple repositories, I think the only way which doesn’t lead to madness (and which is supported with urllib etc.) would be to pass the path to the repository via a query argument (i.e. borg:///path/to/socket?repo=/path/to/repo). That makes it inconsistent with ssh:// and file://, but otherwise parsing will be a PITA.
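The query-argument idea can be sketched with Python's standard urllib.parse. This only illustrates the proposed borg:// URL shapes from this thread; the function name parse_borg_url and the returned dictionary layout are assumptions, and nothing here reflects existing borg code.

```python
from urllib.parse import parse_qs, urlsplit


def parse_borg_url(url: str) -> dict:
    """Split a proposed borg:// URL into transport details and an optional repo path."""
    parts = urlsplit(url)
    if parts.scheme != "borg":
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # The repository path travels as ?repo=..., so it cannot be confused
    # with the socket path or with host:port.
    repo = parse_qs(parts.query).get("repo", [None])[0]
    if parts.netloc:
        # borg://server:port -> TCP socket
        return {
            "transport": "tcp",
            "host": parts.hostname,
            "port": parts.port,
            "repo": repo,
        }
    # borg:///path/to/socket -> UNIX socket (empty network location)
    return {"transport": "unix", "socket_path": parts.path, "repo": repo}
```

When no repo argument is given, the result carries repo=None, which a client could take as the cue to ask the server which repository it exports.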
Variant 2: Have two arguments, --connect and --repository. This is very similar to what @felinira just proposed.

--connect accepts a URI:
- ssh://user@server:22 -> spawns borg serve over SSH
- borg:///path/to/socket -> UNIX
- borg://server:1234 -> TCP

--repository then determines the path to the repository. Note that in this case, ssh:// can not be used with a path (for consistency). In that case, the ssh:// path could be used to point to the borg executable (like: ssh://user@server:22/usr/bin/borg).

An archive is then addressed via three parameters:
- --connect: Specifies how to reach the borg server. Defaults to a value indicating a local server.
- --repository: Specifies where to find the repository on the borg server. If absent and using borg://, we can steal the idea from Variant 1 to let the borg server expose a repository path to the client if it was started with --restrict-to-repository.
- --archive (more likely to be positional): The name of the archive.

I find Variant 2 much more consistent and prefer it.
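The three-parameter addressing could look roughly like this with argparse. It is a sketch of the proposal only: the program name, the "local" default sentinel, and the trailing paths argument are assumptions made for illustration, not borg's actual command line.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the proposed --connect / --repository / archive split."""
    parser = argparse.ArgumentParser(prog="create-sketch")
    parser.add_argument(
        "--connect",
        default="local",  # hypothetical sentinel meaning "use a local server"
        help="how to reach the borg server (ssh://, borg:// URI)",
    )
    parser.add_argument(
        "--repository",
        default=None,  # None: ask the server which repository it exports
        help="path of the repository on the borg server",
    )
    parser.add_argument("archive", help="name of the archive")
    parser.add_argument("paths", nargs="*", help="paths to back up")
    return parser
```

Because each value has its own argument, none of them needs the combined repo::archive parsing that the earlier comments describe as error prone.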
I prefer Version 2 too. One could add an option like --pull-host <ssh host> which would then replace --connect, run borg serve --restrict-to-repository on a local socket, do ssh remote port forwarding and specify the correct --connect options for borg create on the remote end to connect to the socket. Or have a separate command, not sure. This would make it both very simple to implement a pull style backup (install borg on both machines, run one command, don't worry about lifecycle management of borg serve processes) and flexible (you can specify ports and socket files if you want and implement the connection between these sockets yourself).
The only issue with all these things is security. There needs to be an explicit security warning that you shall not expose borg serve directly to the network. It should be obvious and it's already possible if you really want to but this makes it way easier. Maybe the code should explicitly disallow listening on non-local sockets as the connection isn't even authenticated let alone encrypted.
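A check along those lines might be sketched as follows. The function name is_local_listen_address and the accepted address formats (an absolute socket path, host:port, or [ipv6]:port) are assumptions based on the --listen examples in this thread, not an existing borg feature.

```python
import ipaddress
import socket


def is_local_listen_address(spec: str) -> bool:
    """Return True if spec is a UNIX socket path or a loopback host:port."""
    if spec.startswith("/"):
        # Filesystem UNIX socket: access is governed by file permissions.
        return True
    host, sep, _port = spec.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected host:port or a socket path, got {spec!r}")
    host = host.strip("[]")  # allow the [::1]:5060 notation
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        pass  # not an IP literal; fall through to name resolution
    # A hostname counts as local only if every address it resolves to is loopback.
    infos = socket.getaddrinfo(host, None)
    return all(ipaddress.ip_address(info[4][0]).is_loopback for info in infos)
```

Refusing anything for which this returns False would still leave ssh port forwarding usable, since the forwarded endpoints bind to loopback addresses or socket files.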
Your observations sell me even more on Variant 2.
Of course, this means that this needs to be postponed until a breaking release. But since a viable workaround (shellscript + --rsh) exists, I don’t think that’s too bad.
You can even work around it without shell script and --rsh="bash -c \"exec socat [...]\"".
Are there any plans for such a breaking release? Glancing over the issue list I can see quite a few issues tagged breaking. But I understand if this is not really a good enough reason... breaking all wrapper scripts and cron jobs for everyone definitely isn't something to do lightheartedly.
You can even work around it without shell script and --rsh="bash -c \"exec socat [...]\"".
! I did not realize that until now. Thanks. That’ll make things much nicer. Doesn’t even need bash, sh will do.
See #7615 (implement sockets) and #7618 (overlaps with this one).
Have you checked borgbackup docs, FAQ, and open Github issues?
Yes
Is this a BUG / ISSUE report or a QUESTION?
Feature request, I guess?
System information. For client/server mode post info for both machines.
Your borg version (borg -V).
borg 1.1.10
Operating system (distribution) and version.
Debian GNU/Linux bullseye/sid
Hardware / network configuration, and filesystems used.
N/A
How much data is handled by borg?
N/A
Describe the problem you're observing.
I am writing a few things to document and ease pull-style operation (#900). During this, I came across the issue that I somehow need to pass a socket or a pair of file descriptors to borg instead of a rsh command.
Of course, I can emulate that using a shell script like this as RSH:
However, it would be much easier if we could simply pass file descriptor numbers to --rsh on platforms which support that. Alternatively, pass a path to a unix socket to connect to.
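For illustration, the job such an rsh stand-in has to do (copy bytes between its own stdin/stdout and a UNIX socket) can be sketched in Python. This is not the script from the report; relay, write_all and bridge_stdio are hypothetical names, and the sketch assumes stdin and stdout are pipes, as they would be when borg spawns the command.

```python
import os
import selectors
import socket
import sys


def write_all(fd: int, data: bytes) -> None:
    """Write the whole buffer, looping over partial writes."""
    while data:
        written = os.write(fd, data)
        data = data[written:]


def relay(sock: socket.socket, in_fd: int, out_fd: int) -> None:
    """Copy in_fd -> sock and sock -> out_fd until the socket peer closes."""
    sel = selectors.DefaultSelector()
    sel.register(in_fd, selectors.EVENT_READ, "in")
    sel.register(sock, selectors.EVENT_READ, "sock")
    done = False
    while not done:
        for key, _events in sel.select():
            if key.data == "in":
                data = os.read(in_fd, 65536)
                if data:
                    sock.sendall(data)
                else:
                    # EOF on input: stop watching it and half-close the socket.
                    sel.unregister(in_fd)
                    sock.shutdown(socket.SHUT_WR)
            else:
                data = sock.recv(65536)
                if data:
                    write_all(out_fd, data)
                else:
                    # The peer closed its side: nothing more to relay.
                    done = True
                    break
    sel.close()


def bridge_stdio(socket_path: str) -> None:
    """Connect to a UNIX socket and bridge it to this process's stdin/stdout."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        relay(sock, sys.stdin.fileno(), sys.stdout.fileno())
```

A wrapper script would call bridge_stdio with the socket path and ignore the host and command arguments that borg appends to the rsh command line; native support for a socket path would make this extra process unnecessary.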