artefactual / automation-tools

Tools to aid automation of Archivematica and AtoM.
GNU Affero General Public License v3.0

Add DIP upload to SS script #109

Closed jraddaoui closed 5 years ago

jraddaoui commented 5 years ago

Uploads a local DIP to a DIP storage location in an SS instance. Requires access to a pipeline's currently processing location path (the shared path) to move the DIP folder there and send a request to the Storage Service to process that DIP and create a relationship with the AIP it was created from.
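At a high level, that flow could look like the following sketch. The staging subdirectory, endpoint paths, and payload field names are assumptions modeled loosely on the Storage Service packages API, for illustration only; they are not the script's exact implementation.

```python
import os
import shutil


def stage_dip(dip_path, shared_directory):
    """Move the local DIP folder into the pipeline's shared path so the
    Storage Service can see it. The subdirectory name is a placeholder."""
    staging = os.path.join(shared_directory, "tmp")
    os.makedirs(staging, exist_ok=True)
    # shutil.move returns the final destination path of the DIP folder.
    return shutil.move(dip_path, staging)


def build_store_request(dip_uuid, aip_uuid, pipeline_uuid,
                        cp_location_uuid, ds_location_uuid, relative_path):
    """Build the payload for a POST to the SS packages endpoint asking it
    to store the DIP and relate it to the AIP it was created from. Field
    names are an approximation of the SS API, not the exact schema."""
    return {
        "uuid": dip_uuid,
        "package_type": "DIP",
        "origin_pipeline": "/api/v2/pipeline/%s/" % pipeline_uuid,
        "origin_location": "/api/v2/location/%s/" % cp_location_uuid,
        "origin_path": relative_path,
        "current_location": "/api/v2/location/%s/" % ds_location_uuid,
        "related_package_uuid": aip_uuid,
    }
```

The actual request would then be sent to the Storage Service with the user's API credentials; that network step is omitted here.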

Add a METS type argument to the create_dip script to choose between a generated METS file that matches AtoM's requirements for DIP upload and a direct copy of the AIP's METS file, like the Storage Service makes, so the DIP can be moved to the SS. Defaults to the AtoM METS for backwards compatibility.
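The branch between the two METS types could be sketched as below. The function name, argument values, and the choice to raise on an unknown type are all illustrative, not the script's actual flag or code; the AtoM METS generation itself is elided.

```python
import shutil


def get_dip_mets(mets_type, aip_mets_path, dip_mets_path):
    """Choose the DIP's METS file: 'storage-service' copies the AIP METS
    verbatim, like the SS does; 'atom' generates a METS matching AtoM's
    DIP upload requirements (the pre-existing behavior, omitted here)."""
    if mets_type == "storage-service":
        # Reuse the AIP's METS so the DIP can be moved to the SS.
        shutil.copy(aip_mets_path, dip_mets_path)
    elif mets_type == "atom":
        # Default path: generate an AtoM-compatible METS (not sketched).
        raise NotImplementedError("AtoM METS generation not sketched here")
    else:
        raise ValueError("unknown METS type: %s" % mets_type)
    return dip_mets_path
```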

Allow uploading DIPs after creation in create_dips_job. The script accepts different subsets of parameters and optionally uploads the created DIPs to AtoM or the Storage Service, building on its existing tracking functionality to create DIPs from all AIPs in an SS location. Optionally, delete the created DIPs after upload.
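The tracking that makes periodic execution cheap can be sketched with a minimal SQLite schema: one row per AIP that already has a DIP, so each run only processes the AIPs it has not seen. The table and function names are hypothetical, not the script's actual schema.

```python
import sqlite3


def init_db(conn):
    # Hypothetical one-table schema: one row per AIP already processed.
    conn.execute("CREATE TABLE IF NOT EXISTS aip (uuid TEXT PRIMARY KEY)")


def pending_aips(conn, location_aips):
    """Return the AIP UUIDs from the SS location that have no DIP yet."""
    seen = {row[0] for row in conn.execute("SELECT uuid FROM aip")}
    return [uuid for uuid in location_aips if uuid not in seen]


def mark_done(conn, uuid):
    # Record that a DIP was created (and possibly uploaded) for this AIP.
    conn.execute("INSERT OR IGNORE INTO aip (uuid) VALUES (?)", (uuid,))
```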

Use the AIP folder name for the DIP folder on creation, to match what's currently done for DIPs stored in the SS.

Connects to https://github.com/archivematica/Issues/issues/688.

All new parameters are optional, so this should not break any existing setup: a repository upgrade that doesn't change how the commands are executed keeps working, and the only noticeable change in that case should be the DIP folder name. I'll extend the existing README docs in a different PR.

jraddaoui commented 5 years ago

The create_dips_job parameters have grown a lot to allow uploading to the SS or AtoM from within the same script. However, this is a script intended to be executed periodically, so that may not be a huge issue after the first setup. Also, the command's help output is not perfect, but it's good enough:

create_dips_job.py -h:

usage: create_dips_job.py [-h] [--ss-url URL] --ss-user USERNAME --ss-api-key
                          KEY --location-uuid UUID --database-file PATH
                          [--tmp-dir PATH] [--output-dir PATH]
                          [--log-file FILE] [--verbose] [--quiet]
                          [--log-level {ERROR,WARNING,INFO,DEBUG}]
                          [--delete-local-copy]
                          {ss-upload,atom-upload} ...

Create DIPs from an SS location

Get all AIPs from an existing SS instance, filtering them by location,
creating DIPs using the `create_dip` script and keeping track of them
in an SQLite database.

Optionally, uploads those DIPs to AtoM or the Storage Service using
the scripts from `dips` and deletes the local copy.

optional arguments:
  -h, --help            show this help message and exit
  --ss-url URL          Storage Service URL. Default: http://127.0.0.1:8000
  --ss-user USERNAME    Username of the Storage Service user to authenticate
                        as.
  --ss-api-key KEY      API key of the Storage Service user.
  --location-uuid UUID  UUID of an AIP Storage location in the Storage
                        Service.
  --database-file PATH  Absolute path to an SQLite database file.
  --tmp-dir PATH        Absolute path to the directory used for temporary
                        files. Default: /tmp.
  --output-dir PATH     Absolute path to the directory used to place the final
                        DIP. Default: /tmp.
  --log-file FILE       Location of log file
  --verbose, -v         Increase the debugging output.
  --quiet, -q           Decrease the debugging output
  --log-level {ERROR,WARNING,INFO,DEBUG}
                        Set the debugging output level. This will override -q
                        and -v
  --delete-local-copy   Deletes the local DIPs after upload if any of the
                        upload arguments is used.

Upload options:
  The following arguments allow to upload the DIP after creation:

  {ss-upload,atom-upload}
                        Leave empty to keep the DIP in the output path.
    ss-upload           Storage Service upload. Check 'create_dips_job ss-
                        upload -h'.
    atom-upload         AtoM upload. Check 'create_dips_job atom-upload -h'.

create_dips_job.py ss-upload -h:

usage: create_dips_job.py ss-upload [-h] --pipeline-uuid UUID
                                    --cp-location-uuid UUID --ds-location-uuid
                                    UUID [--shared-directory PATH]

optional arguments:
  -h, --help            show this help message and exit
  --pipeline-uuid UUID  UUID of the Archivematica pipeline in the Storage
                        Service
  --cp-location-uuid UUID
                        UUID of the pipeline's Currently Processing location
                        in the Storage Service
  --ds-location-uuid UUID
                        UUID of the pipeline's DIP storage location in the
                        Storage Service
  --shared-directory PATH
                        Absolute path to the pipeline's shared directory.

create_dips_job.py atom-upload -h:

usage: create_dips_job.py atom-upload [-h] [--atom-url URL] --atom-email EMAIL
                                      --atom-password PASSWORD --atom-slug
                                      SLUG [--rsync-target HOST:PATH]

optional arguments:
  -h, --help            show this help message and exit
  --atom-url URL        AtoM instance URL. Default: http://192.168.168.193
  --atom-email EMAIL    Email of the AtoM user to authenticate as.
  --atom-password PASSWORD
                        Password of the AtoM user.
  --atom-slug SLUG      AtoM archival description slug to target the upload.
  --rsync-target HOST:PATH
                        Destination value passed to Rsync. Default:
                        192.168.168.193:/tmp.
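The two upload modes above map naturally onto argparse subparsers. A minimal sketch reproducing the shape of this interface, trimmed to a few representative flags (this mirrors the help output, not the script's actual code):

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="Create DIPs from an SS location")
    # Common arguments (subset of those shown in the help above).
    parser.add_argument("--ss-url", metavar="URL",
                        default="http://127.0.0.1:8000")
    parser.add_argument("--ss-user", metavar="USERNAME", required=True)
    # Optional subcommands: leave empty to keep the DIP locally.
    subparsers = parser.add_subparsers(dest="upload_type")
    ss = subparsers.add_parser("ss-upload", help="Storage Service upload.")
    ss.add_argument("--pipeline-uuid", metavar="UUID", required=True)
    atom = subparsers.add_parser("atom-upload", help="AtoM upload.")
    atom.add_argument("--atom-slug", metavar="SLUG", required=True)
    return parser
```

Because the subcommand is optional, omitting it leaves the parsed `upload_type` unset and the DIP stays in the output path, matching the "Leave empty" note in the help.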
jraddaoui commented 5 years ago

Hi @sevein, back to you with a related issue, sorry for not doing that earlier.