This software tool is designed to enable the curatorial review of datasets that are deposited into the University of Arizona Research Data Repository (ReDATA). It follows a workflow that was developed by members of the Research Data Services Team at the University of Arizona Libraries. The software has a number of backend features, such as:
These backend services ingest the datasets and accompanying files (described above) onto a curatorial "staging" server with attached storage to enable the full curatorial review procedure.
Although not yet available, a web application will serve as the front end to allow easy navigation through the curatorial review. Integration with the Trello REST API is another feature intended to further assist with the curatorial review process.
These instructions will get the code running on your local or virtual machine.
You will need the following to have a working copy of this software (see the installation steps). Note: some of the dependencies will be updated.
- figshare: ReDATA's forked copy of cognoma's figshare
- pandas (1.2.3)
- requests (2.22.0)
- numpy (1.20.0)
- jinja2 (2.11.2)
- tabulate (0.8.3)
- html2text (2020.1.16)
First, install a working version of Python (>=3.7.9). We recommend using the Anaconda package installer.
After you have Anaconda installed, you will want to create a separate conda environment and activate it:
$ (sudo) conda create -n curation python=3.7
$ conda activate curation
With the activated conda environment, next clone the UA Libraries' forked copy of figshare and install with the setup.py script:
(curation) $ cd /path/to/parent/folder
(curation) $ git clone https://github.com/UAL-RE/figshare.git
(curation) $ cd /path/to/parent/folder/figshare
(curation) $ (sudo) python setup.py develop
Then, clone this repository (LD-Cool-P) into the parent folder and install with the setup.py script:
(curation) $ cd /path/to/parent/folder
(curation) $ git clone https://github.com/UAL-RE/LD-Cool-P.git
(curation) $ cd /path/to/parent/folder/LD-Cool-P
(curation) $ (sudo) python setup.py develop
This will automatically install the required pandas, requests, numpy, jinja2, tabulate, and html2text packages.
You can confirm installation via conda list:
(curation) $ conda list ldcoolp
You should see that the version is 1.2.0.
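The same check can be done from Python. The following is a minimal sketch, assuming Python 3.8 or later for the standard-library importlib.metadata module (on Python 3.7 the importlib_metadata backport offers the same interface). The helper name installed_version is hypothetical and not part of LD-Cool-P.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Names are taken from the requirements listed earlier;
    # "ldcoolp" is the name used with `conda list` above.
    for name in ["ldcoolp", "figshare", "pandas", "requests", "numpy",
                 "jinja2", "tabulate", "html2text"]:
        version = installed_version(name)
        print(f"{name}: {version if version else 'not installed'}")
```

Any package reported as "not installed" suggests that the setup.py step did not complete in the active environment.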
Configuration settings are specified through the --config flag in the scripts described below. For example:
--config ldcoolp/config/myconfig.ini
Note that __init__.py defines a default setting:
config_dir = path.join(co_path, 'config/')
main_config_file = 'default.ini'
config_file = path.join(config_dir, main_config_file)
This default is used by all modules and functions that require settings whenever a configuration file is not provided.
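The fallback behaviour can be sketched as follows. This is an illustration rather than the package's actual code: the function resolve_config_file and its package_dir argument are hypothetical, with package_dir standing in for co_path.

```python
from os import path


def resolve_config_file(package_dir, config_file=None):
    """Return config_file if given, else default.ini under package_dir/config/."""
    config_dir = path.join(package_dir, 'config/')
    default_file = path.join(config_dir, 'default.ini')
    return config_file if config_file else default_file


# No configuration file supplied: fall back to the default.
print(resolve_config_file('ldcoolp'))
# An explicit path takes precedence over the default.
print(resolve_config_file('ldcoolp', 'ldcoolp/config/myconfig.ini'))
```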
A template for this configuration file is provided.
There are a number of config sections, including figshare, curation, and qualtrics.
The most important settings to define are those populated with ***override***.
Additional settings to change are the figshare stage flag and the curation source.
Since the configuration settings will continue to evolve, we refer users to the documented information provided.
These configurations are read in through the config sub-package.
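To spot settings that still need to be filled in, one could scan a configuration file for the ***override*** placeholder. The sketch below uses the standard-library configparser and is not how the config sub-package itself works. The function find_overrides is hypothetical, and the option names and values in the sample (token, folder_todo, survey_id) are made up for illustration and may not match the real template.

```python
from configparser import ConfigParser

PLACEHOLDER = "***override***"


def find_overrides(config_text):
    """Return sorted (section, option) pairs whose value is still the placeholder."""
    parser = ConfigParser()
    parser.read_string(config_text)
    pending = []
    for section in parser.sections():
        for option, value in parser.items(section):
            if value.strip() == PLACEHOLDER:
                pending.append((section, option))
    return sorted(pending)


# Hypothetical configuration content; only the section names come from this README.
SAMPLE = """
[figshare]
token = ***override***
stage = False

[curation]
source = figshare
folder_todo = 1.ToDo

[qualtrics]
survey_id = ***override***
"""

for section, option in find_overrides(SAMPLE):
    print(f"{section}.{option} still needs a value")
```

Running a check like this before the scripts are used would flag any section that has not yet been customized.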
This section is under construction
There are or will be a number of ways to execute the software.
There are two ways to execute the software from the command line. The first is to use ipython/python:
article_id = 13456789
from ldcoolp.curation import main
main.workflow(article_id)
Here the article_id is the unique ID that Figshare provides for any article.
The above script will perform the prerequisite steps, including the move from 1.ToDo to the 2.UnderReview curation stage.
Another command-line approach is using the Python script called prereq_script:
(curation) $ ./ldcoolp/scripts/prereq_script \
--config ldcoolp/config/default.ini --article_id 12345678
Additional Python scripts are available to:
Retrieve the list of items pending curation and their article_id:
(curation) $ ./ldcoolp/scripts/get_curation_list \
--config ldcoolp/config/default.ini
Retrieve the Qualtrics URLs to provide to an author/depositor:
(curation) $ ./ldcoolp/scripts/generate_qualtrics_link \
--config ldcoolp/config/default.ini --article_id 12345678
Update the README.txt file for changes to metadata information:
(curation) $ ./ldcoolp/scripts/update_readme \
--config ldcoolp/config/default.ini --article_id 12345678
Move between curation stages (either next, back, or to publish):
(curation) $ ./ldcoolp/scripts/perform_move --direction next \
--config ldcoolp/config/default.ini --article_id 12345678
(curation) $ ./ldcoolp/scripts/perform_move --direction back \
--config ldcoolp/config/default.ini --article_id 12345678
(curation) $ ./ldcoolp/scripts/perform_move --direction publish \
--config ldcoolp/config/default.ini --article_id 12345678
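When driving these moves from another Python program, it can help to validate the direction before calling the script. The following sketch only builds the argument list shown above. build_move_command is a hypothetical helper, not part of LD-Cool-P, and the command is not executed here.

```python
VALID_DIRECTIONS = ("next", "back", "publish")


def build_move_command(direction, article_id,
                       config="ldcoolp/config/default.ini"):
    """Build the perform_move argument list, rejecting unknown directions."""
    if direction not in VALID_DIRECTIONS:
        raise ValueError(
            f"direction must be one of {VALID_DIRECTIONS}, got {direction!r}")
    return [
        "./ldcoolp/scripts/perform_move",
        "--direction", direction,
        "--config", config,
        "--article_id", str(article_id),
    ]


command = build_move_command("next", 12345678)
print(" ".join(command))
# To run it from the repository root inside the curation environment:
#   import subprocess; subprocess.run(command, check=True)
```

Building the list separately from running it keeps the validation testable without touching any curation folders.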
We use SemVer for versioning. For the versions available, see the tags on this repository.
Releases are auto-generated using this GitHub Actions script following a git tag version.
See the CHANGELOG for all changes since project inception.
See also the list of contributors who participated in this project.
This project is licensed under the MIT License - see the LICENSE file for details.