datalad / datalad-ria

Adds functionality for RIA stores to DataLad
http://datalad.org

Analysis of the RemoteIO framework #80

Open mih opened 1 year ago

mih commented 1 year ago

This is a closer look at the functionality provided by the IO framework used in the ora special remote, shipped with datalad-core. The aim is to determine what functionality is not immediately available from datalad-next's UrlOperations, and how the two approaches can be consolidated.

The IOBase class defines a set of operations: https://github.com/datalad/datalad/blob/776f465b6332bb6320d1b4dc45c85112ced1dd67/datalad/distributed/ora_remote.py#L109. Three classes (LocalIO, SSHRemoteIO, HTTPRemoteIO) implement these operations for particular environments/protocols. Of these, LocalIO and SSHRemoteIO derive from IOBase; HTTPRemoteIO is not actually a derived class, and most operations are not implemented for HTTP.
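To make the structure described above concrete, here is a minimal sketch of the pattern: an abstract base declaring operations, one full implementation, and one standalone class that supports only a subset. All class and method names below (`SketchIOBase`, `SketchLocalIO`, `SketchPartialIO`, `exists`, `put`, `read_file`) are hypothetical stand-ins chosen for illustration; they are not the actual IOBase API, for which the linked source is authoritative.

```python
import shutil
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class SketchIOBase(ABC):
    """Hypothetical base class declaring a set of IO operations."""

    @abstractmethod
    def exists(self, path: Path) -> bool:
        """Report whether `path` is present."""

    @abstractmethod
    def put(self, src: Path, dst: Path) -> None:
        """Copy `src` to `dst`."""

    @abstractmethod
    def read_file(self, path: Path) -> str:
        """Return the text content of `path`."""


class SketchLocalIO(SketchIOBase):
    """Implements every declared operation for the local filesystem."""

    def exists(self, path: Path) -> bool:
        return Path(path).exists()

    def put(self, src: Path, dst: Path) -> None:
        # Create the target directory on demand, then copy the file.
        Path(dst).parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(src, dst)

    def read_file(self, path: Path) -> str:
        return Path(path).read_text()


class SketchPartialIO:
    """Deliberately NOT derived from the base: supports reading only."""

    def __init__(self, files: dict):
        # In-memory mapping of name -> content, standing in for a remote.
        self._files = files

    def read_file(self, path: str) -> str:
        return self._files[path]

    def put(self, src, dst) -> None:
        raise NotImplementedError("writing is not supported here")


def demo() -> str:
    """Round-trip a file through the local implementation."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src.txt"
        src.write_text("payload")
        dst = Path(tmp) / "store" / "key"
        io = SketchLocalIO()
        io.put(src, dst)
        assert io.exists(dst)
        return io.read_file(dst)
```

A caller that only relies on the base class interface can swap implementations freely, whereas a class outside the hierarchy, like `SketchPartialIO` here, has to be special-cased. That is the kind of inconsistency a consolidation would need to address.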

The following list provides notes on the availability of each operation, and on related or alternative implementations:

Mapping to UrlOperations

The UrlOperations framework defines the following operations. The list contains notes on which IOBase functionality could be mapped onto them.

There is no equivalent for the "extract from archive" functionality. A more general implementation (via FSSPEC) was proposed (https://github.com/datalad/datalad-next/issues/210), but has not yet materialized. @christian-monch mentioned an implementation matching get_from_archive for HTTP, but it also has not been completed yet.

UrlOperations seems to implement progress reporting consistently, whereas IOBase and friends do not.

SSH-specific observation

SshUrlOperations uses _SshCat, a different, simplistic helper to execute remote SSH commands and read their output. It uses ThreadedRunner and exposes a stdin argument.
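The general pattern behind such a helper (run a command, feed it data on stdin, collect its output) can be sketched with the standard library alone. This is not `_SshCat` and does not use `ThreadedRunner`; the function name `run_with_stdin` is hypothetical, and a local Python process stands in for the remote side so the example runs without an SSH server. For a real remote call, the command list would presumably start with `ssh` and a host name.

```python
import subprocess
import sys
from typing import Sequence


def run_with_stdin(cmd: Sequence[str], stdin_data: bytes) -> bytes:
    """Run `cmd`, send `stdin_data` to its stdin, and return its stdout.

    Raises subprocess.CalledProcessError on a non-zero exit status.
    """
    proc = subprocess.run(
        list(cmd),
        input=stdin_data,
        stdout=subprocess.PIPE,
        check=True,
    )
    return proc.stdout


# Stand-in for a remote `cat`: echo stdin back to stdout unchanged.
ECHO_CMD = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]

if __name__ == "__main__":
    print(run_with_stdin(ECHO_CMD, b"hello"))
```

This version buffers the whole output in memory, which is fine for small payloads. A helper meant for large transfers would need to read the output incrementally instead.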