This is a closer look at the functionality provided by the IO framework used in the ora special remote, shipped with datalad-core. The aim is to determine what functionality is not immediately available from datalad-next's UrlOperations, and how the two approaches can be consolidated.
The IOBase class defines a set of operations (https://github.com/datalad/datalad/blob/776f465b6332bb6320d1b4dc45c85112ced1dd67/datalad/distributed/ora_remote.py#L109). Three (derived) classes (LocalIO, SSHRemoteIO, HTTPRemoteIO) implement these operations for particular environments/protocols. HTTPRemoteIO is not actually a derived class, and most operations are not implemented for HTTP.
The following list provides notes on the availability, and related/alternative implementations:
get_7z [not-http]:
Returns boolean, indicating the availability of the 7z command on the remote end.
mkdir [not-http]:
Creates a directory at a given path, including all parents and regardless of whether it already exists.
symlink [not-http]:
Creates a symlink.
put [not-http]:
Uploads a file.
get:
Downloads a file.
rename [not-http]:
Moves a file/directory.
remove [not-http]:
Deletes an existing file.
remove_dir [not-http]:
Deletes an empty directory.
exists:
Returns a boolean indicating the presence of a file/directory given by a path.
get_from_archive [not-http]:
Extracts a file from a 7z archive and directly writes it through a pipe into a local target file.
in_archive [not-http]:
Returns a boolean indicating the presence of a file/directory inside a 7z archive given by a path.
read_file:
Read a remote file's content (all at once) and return it. This is pretty much get, without writing to a target file.
write_file [not-http]:
Write content to a remote file. Content is passed all at once to printf, so this likely only works for small files. This is pretty much put, without reading from a source file.
Mapping to UrlOperations
This framework defines the following operations. The list contains notes on which IOBase functionality could be mapped onto them:
stat:
Can be used for exists, by handling the UrlOperationsResourceUnknown exception.
download:
Implements get and read_file.
upload [not-http]:
Implements put and write_file. Handles mkdir implicitly.
delete [not-ssh, not-http]:
Implements remove.
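The mapping above can be sketched as a thin adapter. This is only an illustration under stated assumptions: the UrlOps protocol below is a simplified, hypothetical stand-in for the four operations (the real UrlOperations methods take additional arguments such as credentials and timeouts), ResourceUnknown is a local stand-in for UrlOperationsResourceUnknown, and InMemoryOps and IOViaUrlOps are made-up names that exist only so the example runs without network access.

```python
import tempfile
from pathlib import Path
from typing import Dict, Protocol


class ResourceUnknown(Exception):
    """Local stand-in for UrlOperationsResourceUnknown (assumption)."""


class UrlOps(Protocol):
    """Simplified, hypothetical view of the four operations listed above."""

    def stat(self, url: str) -> Dict: ...
    def download(self, url: str, to_path: Path) -> Dict: ...
    def upload(self, from_path: Path, url: str) -> Dict: ...
    def delete(self, url: str) -> Dict: ...


class IOViaUrlOps:
    """Expose a subset of the IOBase-style operations on top of UrlOps."""

    def __init__(self, ops: UrlOps) -> None:
        self._ops = ops

    def exists(self, url: str) -> bool:
        # stat() signals a missing resource via an exception
        try:
            self._ops.stat(url)
        except ResourceUnknown:
            return False
        return True

    def get(self, url: str, target: Path) -> None:
        self._ops.download(url, target)

    def read_file(self, url: str) -> bytes:
        # download into a temporary file, then return its content
        with tempfile.TemporaryDirectory() as tmp:
            target = Path(tmp) / "content"
            self._ops.download(url, target)
            return target.read_bytes()

    def put(self, source: Path, url: str) -> None:
        self._ops.upload(source, url)

    def write_file(self, url: str, content: bytes) -> None:
        # stage the content in a temporary file, then upload it
        with tempfile.TemporaryDirectory() as tmp:
            source = Path(tmp) / "content"
            source.write_bytes(content)
            self._ops.upload(source, url)

    def remove(self, url: str) -> None:
        self._ops.delete(url)


class InMemoryOps:
    """Toy backend used only to exercise the adapter."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def _require(self, url: str) -> bytes:
        if url not in self._store:
            raise ResourceUnknown(url)
        return self._store[url]

    def stat(self, url: str) -> Dict:
        return {"content-length": len(self._require(url))}

    def download(self, url: str, to_path: Path) -> Dict:
        to_path.write_bytes(self._require(url))
        return {}

    def upload(self, from_path: Path, url: str) -> Dict:
        self._store[url] = from_path.read_bytes()
        return {}

    def delete(self, url: str) -> Dict:
        self._require(url)
        del self._store[url]
        return {}
```

With such an adapter, read_file and write_file become small wrappers around download and upload. The operations without a counterpart (symlink, rename, remove_dir, get_7z and the archive helpers) are deliberately left out of the sketch.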
There is no equivalent for the "extract from archive" functionality. A more general implementation (via FSSPEC) was proposed (https://github.com/datalad/datalad-next/issues/210), but has not yet materialized. @christian-monch mentioned an implementation matching get_from_archive for HTTP, but it also has not been completed yet.
UrlOperations seems to implement progress reporting consistently, whereas IOBase and friends do not.
SSH-specific observation
SshUrlOperations uses _SshCat, a different, simplistic helper to execute remote SSH commands and read their output. It uses ThreadedRunner and exposes a stdin argument.
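A stdin argument matters for write_file, because content streamed to a remote command is not limited by the length of a command line, unlike content embedded in a printf call. The following is a minimal sketch of that idea, not the actual _SshCat or SSHRemoteIO code. It uses the standard library's subprocess module instead of ThreadedRunner, and write_via_stdin as well as its runner parameter are hypothetical names introduced for illustration.

```python
import shlex
import subprocess
from typing import Sequence


def write_via_stdin(runner: Sequence[str], path: str, content: bytes) -> None:
    """Write `content` to `path` by piping it into `cat` via stdin.

    `runner` is the command prefix that receives the shell command as its
    last argument, e.g. ["ssh", "user@host"] for a remote target, or
    ["sh", "-c"] to try the same thing locally.
    """
    # quote the path so that spaces and shell metacharacters are preserved
    shell_cmd = f"cat > {shlex.quote(path)}"
    # the content travels over stdin, not as part of the command line
    subprocess.run([*runner, shell_cmd], input=content, check=True)
```

Calling write_via_stdin(["sh", "-c"], "/tmp/example.bin", data) exercises the same code path locally, without an SSH connection.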