datalad / datalad-gooey

A graphical user interface for DataLad (datalad.org)
https://docs.datalad.org/projects/gooey

Frontend for git-annex metadata #280

Closed mih closed 1 year ago

mih commented 2 years ago

There is no datalad command to support this (yet).

There are two essential modes of operation: (1) (mass-)setting metadata, which is what the annex metadata command and AnnexRepo.set_metadata() do; and (2) editing a single metadata record.

There is a big difference between the two modes. (1) does not require knowledge about the state of a metadata record. This operation can be performed like any other command execution: parameterize and run (on a set of files). (2) requires loading a metadata record first.
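To make mode (1) concrete, here is a minimal sketch of "parameterize and run": it only assembles a `git annex metadata` call from a list of operations and a set of paths, without looking at any existing record. The helper name `build_set_command` and the tuple layout are made up for illustration; the `--set field<op>value` syntax and the four operators are those of git-annex itself.

```python
from typing import List, Sequence, Tuple

# Operators understood by `git annex metadata --set`
ANNEX_SET_OPERATORS = ("=", "+=", "?=", "-=")


def build_set_command(
    ops: Sequence[Tuple[str, str, str]],
    paths: Sequence[str],
) -> List[str]:
    """Assemble an argument list for mass-setting metadata (hypothetical helper).

    Each operation is a (field, operator, value) tuple. No metadata record
    is loaded: the same operations are applied to every given path.
    """
    if not paths:
        raise ValueError("at least one path is required")
    cmd = ["git", "annex", "metadata"]
    for field, op, value in ops:
        if op not in ANNEX_SET_OPERATORS:
            raise ValueError(f"unsupported operator: {op!r}")
        cmd += ["--set", f"{field}{op}{value}"]
    cmd += list(paths)
    return cmd


cmd = build_set_command(
    [("author", "=", "mih"), ("tag", "+=", "raw")],
    ["a.dat", "b.dat"],
)
```

The resulting list could be handed to `subprocess.run(cmd, check=True)` inside an annex repository. Mode (2) cannot be expressed this way, because the form would first need the current record of the file.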

There might be a way to make the two modes similar enough to support both of them within the current command execution paradigm. An annex-metadata command could have a --seed parameter (name TBD) that takes a path to an annexed file to load an initial metadata record from. This initial record then becomes a single, explicit starting point for the operations offered by annex metadata (=, +=, ?=, -=, purge), rather than each file's own state serving as its starting point. This record then prepopulates the value for the "set" parameter.

So when run on an individual file, the parameter form could initialize itself from a single file (which could be a dedicated --seed, or just be the only given path).

mih commented 2 years ago

Here is the first concept of a command for metadata manipulation I arrived at. There is a second one, I will post below.

Tentative name: meta-annex (could live in -next or -metalad).

The API is adopted from annex metadata:

meta-annex [ --get field ] <path> [<path> ...]

Report annex metadata for a particular path, optionally limited to a particular field. Otherwise the default is to report all metadata. One path must be given, more are optional. If <path> is a directory all content will be reported on.
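As a rough sketch of what the reporting side could build on: `git annex metadata --json` prints one JSON object per file, with the metadata under a `fields` key, as far as I know. The function below (the name `parse_metadata_report` is made up) turns such lines into a per-file report, optionally limited to one field, and drops git-annex's `lastchanged` bookkeeping fields. The sample lines are invented for illustration, not captured output.

```python
import json
from typing import Dict, Iterable, List, Optional


def parse_metadata_report(
    json_lines: Iterable[str],
    field: Optional[str] = None,
) -> Dict[str, Dict[str, List[str]]]:
    """Map each reported file to its metadata fields (hypothetical helper)."""
    report = {}
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        # Skip the automatically maintained "*lastchanged" fields
        fields = {
            name: values
            for name, values in rec.get("fields", {}).items()
            if name != "lastchanged" and not name.endswith("-lastchanged")
        }
        if field is not None:
            # Emulate `--get field`: keep only the requested field
            fields = {field: fields[field]} if field in fields else {}
        report[rec["file"]] = fields
    return report


# Invented sample input, shaped like the assumed --json output
sample = [
    '{"command": "metadata", "file": "a.dat", "success": true, '
    '"fields": {"author": ["mih"], "author-lastchanged": ["2022-01-01@00-00-00"], '
    '"lastchanged": ["2022-01-01@00-00-00"]}}',
    '{"command": "metadata", "file": "b.dat", "success": true, "fields": {}}',
]
full_report = parse_metadata_report(sample)
author_only = parse_metadata_report(sample, field="author")
```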

meta-annex --set <field><=|+=|?=|-=|!=>[<value>] [--set ...] <path> [<path> ...]

Set metadata for one or more paths. Each --set identifies a metadata field to set, followed by an operator, and (mostly) a value. Operations are inspired by git-annex but amended:

Any number of --set can be given.
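A minimal sketch of how a `--set` specification of the form `<field><operator>[<value>]` could be split into its parts. The names `SetSpec` and `parse_set_spec` are hypothetical. The sketch only parses; it deliberately assigns no meaning to the operators. It assumes that the first `=` in the specification ends the operator, so a field name cannot contain `=`, and a field name ending in `+`, `?`, `-`, or `!` would be misread as carrying an operator.

```python
from typing import NamedTuple, Optional


class SetSpec(NamedTuple):
    field: str
    operator: str
    value: Optional[str]


def parse_set_spec(spec: str) -> SetSpec:
    """Split '<field><=|+=|?=|-=|!=>[<value>]' into its three parts."""
    idx = spec.find("=")
    if idx < 1:
        raise ValueError(f"no field or operator in specification: {spec!r}")
    if spec[idx - 1] in "+?-!":
        # Two-character operator such as '+=' or '!='
        field, operator = spec[: idx - 1], spec[idx - 1 : idx + 1]
    else:
        field, operator = spec[:idx], "="
    if not field:
        raise ValueError(f"missing field name in specification: {spec!r}")
    # The value is optional ("mostly a value"); an empty one becomes None
    value = spec[idx + 1 :] or None
    return SetSpec(field, operator, value)


specs = [parse_set_spec(s) for s in ("author=mih", "tag+=raw", "obsolete!=")]
```

Since any number of `--set` can be given, a command implementation would run each occurrence through such a parser and keep the results in the order given.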

It would be straightforward to support annex metadata's --key later on, too.

mih commented 2 years ago

The second concept is based on the existing set|get_metadata() methods in AnnexRepo.

When called with no "setter options", all metadata on record is reported. Users can use standard result rendering features to pick individual fields to write out.

Instead of one "setter" --set, there are multiple that can each be given multiple times:

For this concept, and also the one above, there would be an additional option take_from | --take-from that identifies an annexed file that provides a metadata record to serve as a starting point for further incremental metadata operations. That record would be translated into a parameterization of --set <field>=<value>, and any necessary --set <field>+=<value> or add <field> <value> specifications. Any additional specifications coming in via command options are sorted after these initial ones.

This would make meta-annex --take-from <path> --set != <nonexisting field> <path> an idempotent operation, i.e. no effective change in the metadata records for the file at <path>.
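The translation of a seed record into initial operations, and the idempotency that follows from it, can be sketched as below. The function names and the `(field, operator, value)` tuples are made up for illustration. `=`, `+=`, `?=` and `-=` follow git-annex's documented semantics; treating `!=` as "purge the field" is an assumption on my part, since its meaning is not spelled out in this thread.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Op = Tuple[str, str, Optional[str]]


def seed_to_ops(record: Dict[str, List[str]]) -> List[Op]:
    """Translate a seed record into '=' and '+=' operations."""
    ops = []
    for field in sorted(record):
        for i, value in enumerate(sorted(record[field])):
            # First value resets the field, further values are appended
            ops.append((field, "=" if i == 0 else "+=", value))
    return ops


def apply_ops(record: Dict[str, List[str]], ops: Sequence[Op]) -> Dict[str, List[str]]:
    """Apply operations in order and return a new record (values sorted)."""
    result = {field: set(values) for field, values in record.items()}
    for field, op, value in ops:
        current = result.get(field, set())
        if op == "=":
            current = {value}
        elif op == "+=":
            current = current | {value}
        elif op == "-=":
            current = current - {value}
        elif op == "?=":
            current = current or {value}  # only set if the field is empty
        elif op == "!=":
            current = set()  # ASSUMPTION: '!=' purges the field
        else:
            raise ValueError(f"unsupported operator: {op!r}")
        if current:
            result[field] = current
        else:
            result.pop(field, None)
    return {field: sorted(values) for field, values in result.items()}


record = {"author": ["mih"], "tag": ["mri", "raw"]}
# Seed operations come first, user-supplied ones are sorted after them
ops = seed_to_ops(record) + [("nonexisting", "!=", None)]
unchanged = apply_ops(record, ops)
```

Applied to the file the seed was taken from, the combined operations leave the record as it was, which is the idempotent case described above.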

Setting this particular option in Gooey can be used to trigger loading metadata from <path> in order to populate the parameter form of meta-annex for convenient editing.

mih commented 1 year ago

I fear that ATM I cannot come up with a way to have a commandline front-end that has an API which also renders to a suitable GUI. I will likely go for a custom editor.

I am leaving a note on two development trajectories that might nevertheless bring an automatically rendered input form: