ReproNim / reproman

ReproMan (AKA NICEMAN, AKA ReproNim TRD3)
https://reproman.readthedocs.io

reproman run --list to get all is awkward #598

Open · asmacdo opened this issue 1 year ago

asmacdo commented 1 year ago
$ reproman run --list
usage: reproman run [--version] [-h] [-l {critical,error,warning,info,debug,1,2,3,4,5,6,7,8,9}] [-m MESSAGE] [-r RESOURCE] [--resref-type TYPE]
                    [--list {submitters,orchestrators,parameters,}] [--submitter NAME] [--orchestrator NAME] [--batch-spec PATH] [--batch-parameter PATH]
                    [--job-spec PATH] [--job-parameter PARAM] [-i PATH] [-o PATH] [--follow [ACTION]]
                    ...
reproman run: error: argument --list: expected one argument
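
The error comes from the parser requiring a value for `--list`. One possible fix, sketched below with plain argparse (an illustration only, not reproman's actual parser code): `nargs="?"` with a `const` would let a bare `--list` mean "list everything" while still accepting a specific category.

```python
import argparse

# Hedged sketch of how an optional-value --list could behave.
# nargs="?" makes the value optional; const="" is substituted when the
# flag is given with no value, so a bare `--list` would mean "list all".
parser = argparse.ArgumentParser(prog="reproman run")
parser.add_argument(
    "--list",
    dest="list_",
    nargs="?",
    const="",      # used for a bare `--list`
    default=None,  # used when `--list` is absent
    choices=["submitters", "orchestrators", "parameters", ""],
)

print(parser.parse_args(["--list"]).list_)                # ''
print(parser.parse_args(["--list", "submitters"]).list_)  # 'submitters'
print(parser.parse_args([]).list_)                        # None
```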

But the empty string is itself an accepted choice (note the trailing comma in `--list {submitters,orchestrators,parameters,}`), so passing empty quotes lists everything:

(repronim-venv) vagrant@ubuntu2204:~/my-experiment$ reproman run --list ''
Submitters
  pbs
    Submit a PBS job.
  condor
    Submit a HTCondor job.
  slurm
    Submit a Slurm job.
  local
    Submit a local job.
  lsf
    Submit an LSF job.

Orchestrators
  plain
    Plain execution on remote directory.

    If no working directory is supplied via the `working_directory`
    job parameter, the remote directory is named with the job ID.
    Inputs are made available with a session.put(), and outputs are
    fetched with a session.get().

    Note: This orchestrator may be sufficient for simple tasks, but
    using one of the DataLad orchestrators is recommended.
  datalad-pair
    Execute command on remote dataset sibling.

    **Preparing the remote dataset** The default `working_directory`
    is a directory named with the dataset ID under `root_directory`.
    If the dataset doesn't exist, one is created, with a remote named
    after the resource.

    If the dataset already exists on the remote, the remote is
    updated, and the local commit is checked out on the remote. The
    orchestrator will check out a detached HEAD if needed. It won't
    proceed if the working tree is dirty, and it won't advance a branch
    if it is checked out and the update is not a fast-forward.

    To get inputs on the remote, a `datalad get` call is first tried
    to retrieve inputs from public sources. If that fails, a `datalad
    push ... INPUTS` call from the local dataset to the remote dataset
    is performed.

    **Fetching a completed job** `datalad update` is called to bring
    in the remote changes, along with a `datalad get` call to fetch
    the specified outputs. On completion, the HEAD on the remote will
    be a commit recording changes from the run. It is marked with a
    git ref: refs/reproman/JOBID.
  datalad-no-remote
    Execute a command in the current local dataset.

    Conceptually this behaves like datalad-pair. However, the working
    directory for execution is set to the local dataset. It is
    available for local shell resources only.
  datalad-pair-run
    Execute command in remote dataset sibling and capture results
    locally as run record.

    The remote is prepared as described for the datalad-pair
    orchestrator.

    **Fetching a completed job** After the job completes on the
    remote, the outputs are bundled into a tarball. (Outputs are
    identified based on file time stamps, not on the specified
    outputs.) This tarball is downloaded to the local machine and used
    to create a `datalad run` commit. The local commit will be marked
    with a git ref: refs/reproman/JOBID.
  datalad-local-run
    Execute command in a plain remote directory and capture results
    locally as run record.

    This orchestrator is useful when the remote resource does not have
    DataLad installed. The remote is prepared as described for the
    plain orchestrator. The fetch is performed as described for the
    datalad-pair-run orchestrator.

Job parameters
  root_directory
    The root run directory on the resource.

    By default, the working directory for a particular command is a
    subdirectory of this directory. Orchestrators can also use this
    root to store things outside of the working directory (e.g.
    artifacts used in the fetch).
  working_directory
    Directory in which to run the command.
  command_str, command
    Command to run (string and list form). A command will usually be
    set from the command line, but it can also be set in the job spec.
    If both the string and list forms are defined, the string form is used.
  submitter
    Name of submitter. The submitter controls how the command should
    be submitted on the resource (e.g., with `condor_submit`).
  orchestrator
    Name of orchestrator. The orchestrator performs pre- and post-
    command steps like setting up the directory for command execution
    and storing the results.
  batch_spec
    YAML file that defines a series of records with parameters for
    commands. A command will be constructed for each record, with
    record values available in the command as well as the inputs and
    outputs as `{p[KEY]}`.
  batch_parameters
    Define batch parameters with 'KEY=val1,val2,...'. Multiple keys
    can be specified by repeating the option, in which case the
    product of the values is taken. For example, 'subj=mei,satsuki'
    and 'day=1,2' would expand to four records, pairing each subj with
    each day. Values can be a glob pattern to match against the
    current working directory.
  inputs, outputs
    Input and output files (list) to the command.
  message
    Message to use when saving the run. The details depend on the
    orchestrator, but in general this message will be used in the
    commit message.
  container
    Container to use for execution. This should match the name of a
    container registered with the datalad-container extension. This
    option is valid only for DataLad run orchestrators.
  memory, num_processes
    Supported by Condor and PBS submitters.
  num_nodes, walltime
    Supported by PBS submitter.
  queue
    Supported by Slurm submitter.
  launcher
    If set to "true", the job will be run using Launcher, rather than
    as a job-array. See https://github.com/TACC/launcher for more
    info. Supported by Slurm and PBS submitters.
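
To illustrate the `batch_parameters` expansion described in the help text above, here is a minimal sketch of the documented product semantics using `itertools.product` (a hypothetical helper written against the described behavior, not reproman's implementation):

```python
from itertools import product

def expand_batch_parameters(params):
    # `params` maps each key to its comma-separated values, mirroring
    # CLI specs like 'subj=mei,satsuki' and 'day=1,2'. Hypothetical
    # helper, not reproman code.
    keys = list(params)
    value_lists = [params[k].split(",") for k in keys]
    # One record per element of the cartesian product of all values;
    # each record's values are what `{p[KEY]}` would refer to.
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]

for record in expand_batch_parameters({"subj": "mei,satsuki", "day": "1,2"}):
    print(record)
# {'subj': 'mei', 'day': '1'}
# {'subj': 'mei', 'day': '2'}
# {'subj': 'satsuki', 'day': '1'}
# {'subj': 'satsuki', 'day': '2'}
```

This yields the four records the help text describes, pairing each subj with each day.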