Core package and command line utility for E-ARK Information Package validation.
The validation core component implements validation rules defined by E-ARK specifications which can be found on the website of the Digital Information LifeCycle Interoperability Standards Board (DILCIS Board):
https://dilcis.eu/specifications/
Python 3.10 or later is required to run the E-ARK Python Information Package Validator.
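If you are unsure which interpreter is on your path, the version can be checked from Python itself. The following is a minimal sketch; the helper name meets_minimum is hypothetical and is not part of the validator.

```python
import sys

# Minimum version stated in this README.
MINIMUM = (3, 10)


def meets_minimum(version=None, minimum=MINIMUM):
    """Return True if the (major, minor) version is at least `minimum`.

    Hypothetical helper for illustration only; not part of eark-validator.
    """
    if version is None:
        version = sys.version_info
    # Compare only major and minor, e.g. (3, 9) < (3, 10).
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "too old"
    print(f"Python {current}: {status}")
```

Comparing tuples rather than version strings avoids the classic mistake of treating "3.9" as newer than "3.10".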
You must be running either a Debian/Ubuntu Linux distribution or Windows Subsystem for Linux (WSL) on Windows to follow these commands. If you are running a different Linux distribution, replace the apt commands with the equivalents for your package manager. To get Windows Subsystem for Linux up and running, please follow the guide further down and then come back to this step.
It is recommended that you create a directory for your E-ARK work. Enter the following:
mkdir EARK
To enter the directory, use the following command:
cd EARK/
To retrieve the source code from GitHub, use the following command:
git clone https://github.com/E-ARK-Software/eark-validator.git
To enter the new directory containing the source code do:
cd eark-validator/
It is recommended that you create a virtual environment for Python. Doing so avoids "polluting" the host operating system with dynamically fetched dependencies and gives you a reproducible environment for your validator.
To create a virtual environment we need virtualenv (not to be confused with the venv module from the Python standard library). We also need python3-pip to handle our Python packages. Install pip first by issuing the following command:
sudo apt install python3-pip
It will list a number of dependencies. Confirm that you wish to install python3-pip by pressing Y followed by ENTER.
Now we can install virtualenv with the following command:
sudo apt install python3-virtualenv
It will list a number of dependencies. Confirm that you wish to install python3-virtualenv by pressing Y followed by ENTER.
Finally we will need unzip. Install that by doing:
sudo apt install unzip
It may list a number of dependencies. Confirm that you wish to install unzip by pressing Y followed by ENTER.
Set up a local virtual environment by issuing the following commands (one line at a time):
virtualenv -p python3 venv
source venv/bin/activate
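To confirm that the environment is actually active before installing anything, you can ask the interpreter where it lives. This is a minimal sketch; in_virtual_environment is a hypothetical helper name, and the sys.real_prefix check is only there on the assumption that an older virtualenv release may be in use.

```python
import sys


def in_virtual_environment():
    """Return True if this interpreter is running inside a virtual environment.

    Hypothetical helper for illustration only.
    """
    # Inside a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the system installation.
    base = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead.
    return hasattr(sys, "real_prefix") or sys.prefix != base


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
    print("packages will be installed under:", sys.prefix)
```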
Update pip to the latest version, then install the validator together with all the packages it requires:
pip install -U pip
pip install .
You are now able to run the application "eark-validator". It will validate an Information Package for you.
You can test a valid package by first retrieving it from the test corpus:
wget https://github.com/DILCISBoard/eark-ip-test-corpus/raw/integration/corpora/csip/metadata/metshdr/CSIP12/valid/mets-xml_metsHdr_agent_TYPE_exist.zip
Unzip the package:
unzip mets-xml_metsHdr_agent_TYPE_exist.zip
Delete the .zip file you just downloaded:
rm mets-xml_metsHdr_agent_TYPE_exist.zip
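If wget or unzip is not available, the same download, unpack, and delete steps can be sketched with the Python standard library. The function names download and extract_and_remove are hypothetical and not part of the validator; the URL in the comment is the test corpus package used above.

```python
import urllib.request
import zipfile
from pathlib import Path


def download(url, target):
    """Download `url` to the file `target` and return it as a Path."""
    target = Path(target)
    urllib.request.urlretrieve(url, str(target))
    return target


def extract_and_remove(zip_path, dest_dir="."):
    """Unpack `zip_path` into `dest_dir`, delete the archive, and return
    the sorted top-level names that were extracted.

    Hypothetical helper for illustration only.
    """
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
        archive.extractall(dest_dir)
    # Mirrors the `rm` step: the archive is no longer needed once unpacked.
    zip_path.unlink()
    return sorted({name.split("/")[0] for name in names})


# Example usage (requires network access, so it is left commented out):
# url = ("https://github.com/DILCISBoard/eark-ip-test-corpus/raw/integration/"
#        "corpora/csip/metadata/metshdr/CSIP12/valid/"
#        "mets-xml_metsHdr_agent_TYPE_exist.zip")
# archive = download(url, "mets-xml_metsHdr_agent_TYPE_exist.zip")
# print(extract_and_remove(archive))
```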
Run the eark-validator:
eark-validator mets-xml_metsHdr_agent_TYPE_exist/
Result:
('Path mets-xml_metsHdr_agent_TYPE_exist/ is dir, struct result is: '
'StructureStatus.WellFormed')
If the path passed is a directory, it must contain a single folder which contains the information package (and no other files or folders):
user@machine:~$ tree input
<path to directory>
├── documentation
├── metadata
├── METS.ipxml
├── representations
│   └── rep1
│       ├── data
│       ├── metadata
│       └── METS.ipxml
└── schemas
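The single-folder rule for directory input can be checked before invoking the validator. The sketch below is only an illustration of the rule as stated here; find_package_root is a hypothetical helper, and the validator's own check may behave differently.

```python
from pathlib import Path


def find_package_root(path):
    """Return the single sub-folder of `path`, or raise ValueError.

    Hypothetical helper mirroring the rule described above: the directory
    must hold exactly one folder and no other files or folders.
    """
    entries = sorted(Path(path).iterdir())
    if len(entries) != 1 or not entries[0].is_dir():
        names = [entry.name for entry in entries]
        raise ValueError(
            f"expected exactly one folder in {path}, found {names}"
        )
    return entries[0]
```

Raising an error rather than returning None makes a stray file (for example a leftover .zip archive) show up immediately, with the offending names in the message.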
If you do not have Linux and have not previously used WSL, please perform the following steps. You must be logged in either as Administrator or as a user with Administrator rights on the machine.
Start a command prompt (cmd.exe) and then enter the following command:
wsl --install
Confirm that the app is allowed to make changes to your device. Installation begins.
Confirm once more that an app is allowed to make changes to your device.
Retrieving and installing the necessary components takes a while. Please do not reboot or shut down your computer during this process. Even if it seems stalled, it is working.
Installation concludes with the message: "The requested operation is successful. Changes will not be effective until the system is rebooted."
Please reboot your computer.
You will be prompted to create a new "UNIX username". By convention this is often an all-lowercase username of fewer than nine characters. It does not need to match your Windows username.
You will be prompted to set a password.
You are now logged into Ubuntu (the default Linux distribution used by Windows Subsystem for Linux).
No matter how fresh the install, there will almost always be updates available. To refresh the package lists, enter the following:
sudo apt update
And to install the available updates:
sudo apt upgrade
Confirm that you wish to upgrade your packages by pressing Y followed by ENTER.
Please resume the guide above.
Developers should also install the testing dependencies (e.g. pytest), using the --editable flag:
pip install -U pip
pip install --editable ".[testing]"
You can run the unit tests from the project root:
pytest ./tests/
To generate test coverage figures:
pytest --cov=eark_validator ./tests/
To see which parts of your code aren't tested, generate an HTML report:
pytest --cov=eark_validator --cov-report=html ./tests/
After this you can open the file <projectRoot>/htmlcov/index.html in your browser and survey the gory details.