This issue serves as a coordination hub for the RIA annex remote: its requirements, a description of implementation alternatives, and the selection of implementation options. (I am using "RIA annex remote" instead of "ORA" here to reduce the name space a little.)

Requirements for the annex remote

The following lists contain the identified functional and non-functional requirements. Checked requirements apply. Unchecked requirements have been identified but do not need to be fulfilled. To add a new requirement, edit this issue and note the change in the changelog.

Functional requirements

[x] Compatible with the RIA implementation in datalad core:
    [x] Support archives
[x] Support side-channel git annex access on ria+ssh:-stores (git annex should be able to locally access the objects in the RIA store)
[x] Support side-channel git annex access on ria+file:-stores
[x] Support side-channel git annex access on ria+http:-stores
[x] Support for ria+ssh:
[x] Support for ria+file:
[ ] Support for ria+sftp: (from issue #100)
[x] Read-only support for ria+https:
[ ] Write support for ria+https:
[x] Support for POSIX-hosted RIA Stores
[ ] Support for Windows-hosted RIA Stores

Non-functional requirements

Correct (not negotiable)
Maintainable (not negotiable)
[x] Efficient

Implementation alternatives and status for the annex remote

IO abstraction vs multi-flavor RIA annex-remote implementation

In issue #99 we concluded that it is too restrictive to base the RIA annex-remote implementation on a file-system paradigm. It turned out that this abstraction layer is a logical bottleneck that works well for file-based access but does not translate easily to HTTP-based access. It is also unlikely to work for general object stores (it would require extending the abstraction layer with object-store-specific operations and switching between them in the higher-level implementation). See also #30.

The chosen alternative is an implementation that uses object-store-specific handlers to implement the basic annex-remote operations, e.g. TRANSFER RETRIEVE, TRANSFER STORE, CHECKPRESENT, and REMOVE.

This is currently done in PR #106. An abstract base class defines transfer_store, transfer_retrieve, checkpresent, and remove. ssh-, file-, and http-specific subclasses implement the abstract methods for the respective store.
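
The structure described above can be sketched roughly as follows. This is not the code from PR #106; the class names `RiaHandler` and `FileRiaHandler`, the helper `handler_for`, and the flat object layout are assumptions made for illustration. Only the four method names are taken from the description above.

```python
import shutil
from abc import ABC, abstractmethod
from pathlib import Path
from urllib.parse import urlparse


class RiaHandler(ABC):
    """Hypothetical base class: one subclass per store flavor."""

    @abstractmethod
    def transfer_store(self, key: str, local_file: Path) -> None: ...

    @abstractmethod
    def transfer_retrieve(self, key: str, local_file: Path) -> None: ...

    @abstractmethod
    def checkpresent(self, key: str) -> bool: ...

    @abstractmethod
    def remove(self, key: str) -> None: ...


class FileRiaHandler(RiaHandler):
    """Flavor for ria+file: stores.

    Assumption: objects live in one flat directory. A real RIA store
    uses a deeper directory layout, which is omitted here.
    """

    def __init__(self, object_dir: Path) -> None:
        self.object_dir = Path(object_dir)

    def _path(self, key: str) -> Path:
        return self.object_dir / key

    def transfer_store(self, key: str, local_file: Path) -> None:
        self.object_dir.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(local_file, self._path(key))

    def transfer_retrieve(self, key: str, local_file: Path) -> None:
        shutil.copyfile(self._path(key), local_file)

    def checkpresent(self, key: str) -> bool:
        return self._path(key).is_file()

    def remove(self, key: str) -> None:
        # Removing a key that is already absent is treated as success
        self._path(key).unlink(missing_ok=True)


def handler_for(url: str) -> RiaHandler:
    """Pick a handler by URL scheme; only the file flavor is sketched."""
    parsed = urlparse(url)
    if parsed.scheme == "ria+file":
        return FileRiaHandler(Path(parsed.path))
    raise NotImplementedError(f"no handler sketched for {parsed.scheme!r}")
```

The higher-level remote code would only talk to `RiaHandler`, so adding an ssh or http flavor means adding a subclass rather than extending a shared file-system abstraction.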

Current choice: multi-flavor RIA annex-remote implementation

URL-operations vs. individual implementations

Generally, URL-operations map nicely onto annex-remote operations, e.g. TRANSFER RETRIEVE maps onto download. So it seems natural to rely completely on UrlOperations to implement the RIA annex remote (for supported URL-schemes). But issue #102 (atomicity) and issue #103 (ensure_writable) highlight that UrlOperations might not yet fully support everything an annex remote needs.
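
To make the mapping concrete, here is a minimal sketch of how annex request lines could be dispatched onto URL operations. The `UrlOps` protocol and the function `handle_request` are hypothetical. The method names `download`, `upload`, `stat`, and `delete` follow my understanding of the UrlOperations interface and should be checked against datalad-next. Signalling a missing object with `FileNotFoundError` is an assumption of this sketch.

```python
from typing import Protocol


class UrlOps(Protocol):
    """Hypothetical subset of a UrlOperations-like interface."""

    def download(self, from_url: str, to_path: str) -> None: ...
    def upload(self, from_path: str, to_url: str) -> None: ...
    def stat(self, url: str) -> dict: ...
    def delete(self, url: str) -> None: ...


def handle_request(ops: UrlOps, base_url: str, line: str) -> str:
    """Map one annex request line onto a URL operation, return the reply."""
    command, _, rest = line.partition(" ")
    if command == "TRANSFER":
        # The file name comes last and may contain spaces
        direction, key, path = rest.split(" ", 2)
        try:
            if direction == "RETRIEVE":
                ops.download(f"{base_url}/{key}", path)
            elif direction == "STORE":
                ops.upload(path, f"{base_url}/{key}")
            else:
                return "UNSUPPORTED-REQUEST"
            return f"TRANSFER-SUCCESS {direction} {key}"
        except Exception as e:
            return f"TRANSFER-FAILURE {direction} {key} {e}"
    if command == "CHECKPRESENT":
        try:
            ops.stat(f"{base_url}/{rest}")
            return f"CHECKPRESENT-SUCCESS {rest}"
        except FileNotFoundError:
            # Assumption: a missing object is signalled this way
            return f"CHECKPRESENT-FAILURE {rest}"
        except Exception as e:
            return f"CHECKPRESENT-UNKNOWN {rest} {e}"
    if command == "REMOVE":
        try:
            ops.delete(f"{base_url}/{rest}")
            return f"REMOVE-SUCCESS {rest}"
        except Exception as e:
            return f"REMOVE-FAILURE {rest} {e}"
    return "UNSUPPORTED-REQUEST"
```

The sketch shows why the mapping is attractive, and also where it is thin: nothing in these four calls says whether an upload is atomic or whether the target directory is writable.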

There might also be an efficiency issue, at least for SshUrlOperations, which sets up a new ssh-connection for each operation. Therefore PR #106 uses the new persistent shell from datalad_next.shell (which is not yet merged into the main branch of datalad-next). The persistent shell supports arbitrary shell commands, which allows for efficient implementations of atomicity and ensure_writable. It also allows the remote execution of scripts, which can improve the efficiency of complex operations like ensure_writable.
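
For reference, the following sketch shows what atomicity and ensure_writable could look like for a store on a local POSIX file system. The function names and the exact semantics are assumptions made for illustration; the details discussed in issues #102 and #103 are not reproduced here. On an ssh-hosted store, the equivalent steps would be issued as shell commands through the persistent shell instead.

```python
import os
import shutil
import stat
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def ensure_writable(directory: Path):
    """Temporarily add owner write permission to `directory` (POSIX)."""
    old_mode = stat.S_IMODE(directory.stat().st_mode)
    directory.chmod(old_mode | stat.S_IWUSR)
    try:
        yield
    finally:
        # Restore the original permissions, also on error
        directory.chmod(old_mode)


def atomic_store(source: Path, destination: Path) -> None:
    """Copy `source` to a temporary name next to `destination`, then rename.

    The rename happens within one directory, so readers see either the
    complete object or no object at all.
    """
    with ensure_writable(destination.parent):
        fd, tmp_name = tempfile.mkstemp(dir=destination.parent, prefix=".tmp-")
        try:
            with os.fdopen(fd, "wb") as tmp, open(source, "rb") as src:
                shutil.copyfileobj(src, tmp)
            os.replace(tmp_name, destination)
        except BaseException:
            # Do not leave a partial temporary file behind
            Path(tmp_name).unlink(missing_ok=True)
            raise
```

Each step here is cheap locally, but over ssh it would cost one connection per step if every step were a separate URL operation. That is the motivation for running them through one persistent shell, or bundling them into a single remote script.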

Current choice: individual implementations, using UrlOperations and persistent shells

Requirements for `datalad create-sibling-ria`

The `datalad create-sibling-ria` commands should move from datalad-core to datalad-ria. The commands use the io-abstraction. If we drop the io-abstraction (as argued above), the commands should probably be re-implemented without it.

[x] move datalad create-sibling-ria from datalad-core to datalad-ria
[ ] implement datalad create-sibling-ria without the io-abstraction, i.e. base it on UrlOperations, datalad_next.shell, and other existing mechanisms.

Changelog

2024-04-12: @christian-monch: created

datalad / datalad-ria

Coordination: RIA annex remote requirements, Implementation alternatives, and status #107

