This DataLad extension can be thought of as a staging area for additional functionality, or for improved performance and user experience. Unlike other topical or more experimental extensions, the focus here is on functionality with broad applicability. This extension is a suitable dependency for other software packages that intend to build on this improved set of functionality.
# create and enter a new virtual environment (optional)
$ virtualenv --python=python3 ~/env/dl-next
$ . ~/env/dl-next/bin/activate
# install from PyPI
$ python -m pip install datalad-next
Additional commands provided by this extension are immediately available after installation. However, in order to fully benefit from all improvements, the extension has to be enabled for auto-loading by executing:
git config --global --add datalad.extensions.load next
Doing so will enable the extension to also alter the behavior of the core DataLad package and its commands.
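To confirm that auto-loading is configured, the setting can be read back from the global Git configuration. The following is a minimal sketch, not part of this extension: the helper names `parse_extension_list`, `is_autoloaded`, and `read_global_extension_config` are hypothetical, and the only external call assumed is `git config --global --get-all`, which prints one configured value per line.

```python
import subprocess


def parse_extension_list(config_output: str) -> list[str]:
    """Split `git config --get-all` output into one extension name per line."""
    return [line.strip() for line in config_output.splitlines() if line.strip()]


def is_autoloaded(config_output: str, extension: str = "next") -> bool:
    """Report whether `extension` is among the configured values."""
    return extension in parse_extension_list(config_output)


def read_global_extension_config() -> str:
    """Return the raw values of `datalad.extensions.load` (hypothetical helper).

    Git exits with a non-zero code when the key is unset, so that case is
    tolerated and yields an empty string.
    """
    proc = subprocess.run(
        ["git", "config", "--global", "--get-all", "datalad.extensions.load"],
        capture_output=True,
        text=True,
        check=False,
    )
    return proc.stdout
```

Calling `is_autoloaded(read_global_extension_config())` on a machine with Git installed reports whether `next` is listed. Parsing is kept separate from the subprocess call so that it can be checked without touching the real configuration.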
- `credentials` command to set, remove, and query credentials.
- `create-sibling-...` commands for the platforms GitHub, GIN, GOGS, Gitea
  are equipped with improved credential handling that, for example, only stores
  entered credentials after they were confirmed to work, or auto-selects the
  most recently used, matching credentials, when none are specified.
- `create-sibling-webdav` command for hosting datasets on a WebDAV server via
  a sibling tandem for Git history and file storage. Datasets hosted on WebDAV
  in this fashion are cloneable with `datalad-clone`. A full annex setup
  for storing complete datasets with historical file content versions, and an
  additional mode for depositing single-version dataset snapshots are supported.
  The latter enables convenient collaboration with audiences that are not using
  DataLad, because all files are browsable via a WebDAV server's point-and-click
  user interface.
- `datalad-push` automatically exports files to git-annex special
  remotes configured with `exporttree=yes`.
- `datalad-push` is faster when processing non-git special remotes. This
  particularly benefits less efficient hosting scenarios like WebDAV.
- `datalad-siblings enable` (`AnnexRepo.enable_remote()`) automatically
  deploys credentials for git-annex special remotes that require them.
- `git-remote-datalad-annex` is a Git remote helper to push/fetch to any
  location accessible by any git-annex special remote.
- `git-annex-backend-XDLRA` (originally available from the `mihextras`
  extension) is a custom external git-annex backend used by
  `git-remote-datalad-annex`. A base class to facilitate development of
  external backends in Python is also provided.
- `datalad-configuration` supports getting configuration from "global"
  scope without a dataset being present.
- A modular framework for URL operations that supports `http(s)`, `ssh`, and
  `file` URLs, and can be extended with custom functionality
  for additional protocols or even interaction with specific individual servers.
  The basic operations `download`, `upload`, `delete`, and `stat` are recognized,
  and can be implemented. The framework offers uniform progress reporting and
  simultaneous content hash computation. This framework is meant to replace and
  extend the downloader/provider framework in the DataLad core package. In
  contrast to its predecessor it is integrated with the new credential framework,
  and supports operations beyond downloading.
- `git-annex-remote-uncurl` is a special remote that exposes the new URL
  operations framework via git-annex. It provides flexible means to compose
  and rewrite URLs (e.g., to compensate for storage infrastructure changes)
  without having to modify individual URLs recorded in datasets. It enables
  seamless transitions between any services and protocols supported by the
  framework. This special remote can replace the `datalad` special remote
  provided by the DataLad core package.
- A `download` command is provided as a front-end for the new modular URL
  operations framework.
- A `python-requests` compatible authentication handler (`DataladAuth`) that
  interfaces with DataLad's credential system.
- Improved `runner` component for command execution.
- `constraints` system for type conversion and parameter validation.
- `next-status` command that is substantially faster than `status`, and offers
  a `mono` recursion mode that shows modifications of nested dataset
  hierarchies relative to the state of the root dataset.
  Requires Git v2.31 (or later).
- `commands`, `annexremotes`, `datasets` (etc.) are collected in topical
  top-level modules that provide "all" necessary pieces in a single place.
- `webdav_server` fixture that automatically deploys a local WebDAV
  server.
- `probe_url()` discovers redirects and authentication requirements for an HTTP
  URL.
- `get_auth_realm()` returns a label for an authentication realm that can be
  used to query for matching credentials.
- `get_specialremote_credential_properties()` inspects a special remote and
  returns properties for querying a credential store for matching credentials.
- `update_specialremote_credential()` updates a credential in a store after
  successful use.
- `get_specialremote_credential_envpatch()` returns a suitable environment
  "patch" from a credential for a particular special remote type.
- Patching helpers (`datalad_next.utils.patch`).
- Base class for `git-annex` backends.
- `pytest` fixtures.
- `iter_subproc()` helper that enables communication with subprocesses
  via input/output iterables.
- `shell` context manager that enables interaction with (remote) shells,
  including support for input/output iterables for each shell-command execution
  within the context.

Some of the features described above rely on a modification of the DataLad core
package itself, rather than coming in the form of additional commands. Loading
this extension causes a range of patches to be applied to the `datalad` package
to enable them. A comprehensive description of the current set of patches is
available at http://docs.datalad.org/projects/next/en/latest/#datalad-patches
This extension package moves fast in comparison to the core package. Nevertheless, attention is paid to API stability, adequate semantic versioning, and informative changelogs.
Anything that can be imported directly from any of the sub-packages in
`datalad_next` is considered to be part of the public API. Changes to this API
determine the versioning, and development is done with the aim to keep this API
as stable as possible. This includes signatures and return value behavior.
As an example: `from datalad_next.runners import iter_git_subproc` imports a
part of the public API, but `from datalad_next.runners.git import iter_git_subproc`
does not.
Developers can obviously use parts of the non-public API. However, this should only be done with the understanding that these components may change from one release to another, with no guarantee of transition periods, deprecation warnings, etc.
Developers are advised to never reuse any components with names starting with
`_` (underscore). Their use should be limited to their individual subpackage.
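The two rules above can be expressed as a small check on an import statement. This is a sketch of one possible reading of the convention, not a tool shipped with the extension: `is_public_import` is a hypothetical helper, and it assumes that "directly from a sub-package" means a module path of exactly the form `datalad_next.<subpackage>`.

```python
def is_public_import(module: str, name: str) -> bool:
    """Classify `from <module> import <name>` under the stated convention.

    Assumption: only imports from `datalad_next.<subpackage>` count as public,
    and any component whose name starts with an underscore is private.
    """
    parts = module.split(".")
    if len(parts) != 2 or parts[0] != "datalad_next":
        return False
    if parts[1].startswith("_") or name.startswith("_"):
        return False
    return True
```

Under this reading, `is_public_import("datalad_next.runners", "iter_git_subproc")` is true, while the same name imported from `datalad_next.runners.git` is classified as non-public.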
This DataLad extension was developed with funding from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under grant SFB 1451 (431549029, INF project).