archivematica-src

Archivematica installation from its source code repositories.

Table of Contents

Role Variables
Environment variables
Database requirements
Backward-compatible logging
Configure ClamAV
Disable Elasticsearch use
Deploy separate SS and pipeline
Tags
Dependencies
Example Playbooks
License
Author Information

Role Variables

See defaults/main.yml for a comprehensive list of variables.

Environment variables

The following role variables accept dictionaries of environment variables that are passed to the corresponding Archivematica components:

Dashboard: archivematica_src_am_dashboard_environment
MCPServer: archivematica_src_am_mcpserver_environment
MCPClient: archivematica_src_am_mcpclient_environment
Storage Service: archivematica_src_ss_environment

The default values for these dictionaries can be found in vars/envs.yml. The user-provided dictionaries are combined with the defaults.

The authoritative reference for the supported environment variables is the install/README.md file of each Archivematica component. Use the links above to find them.

The following example redefines an environment variable of the MCPServer:

---
archivematica_src_am_mcpserver_environment:
  ARCHIVEMATICA_MCPSERVER_MCPSERVER_SHAREDDIRECTORY: "/tmp/shared-directory"

The final environment generated by this role is:

{
  "DJANGO_SETTINGS_MODULE": "settings.common",
  "ARCHIVEMATICA_MCPSERVER_MCPSERVER_SHAREDDIRECTORY": "/tmp/shared-directory"
}
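The merge described above behaves like Ansible's combine filter, where keys from the user-provided dictionary are added to, or override, the defaults. The playbook below is only an illustrative sketch: the variable names example_defaults and example_overrides are hypothetical, and it is an assumption that the role merges its dictionaries in exactly this way.

```yaml
---
# Illustrative only: mimics how a user-provided dictionary can be merged
# with defaults. The variable names are hypothetical, not role variables.
- hosts: localhost
  connection: local
  gather_facts: false
  vars:
    example_defaults:
      DJANGO_SETTINGS_MODULE: "settings.common"
    example_overrides:
      ARCHIVEMATICA_MCPSERVER_MCPSERVER_SHAREDDIRECTORY: "/tmp/shared-directory"
  tasks:
    - name: Show the merged environment
      debug:
        msg: "{{ example_defaults | combine(example_overrides) }}"
```

Running it should print a dictionary containing both keys, comparable to the final environment shown above.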

Several role variables also affect the final environment dictionaries. For example, when archivematica_src_ca_custom_bundle is set, the environment variable REQUESTS_CA_BUNDLE is added to all of them. This is done mostly for backward compatibility or convenience. See tasks/envs-patch-backward-compatibility.yml for the full list of these variables and their effects.
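As a minimal sketch of the CA bundle case, the variable might be set as shown below. The file path is a hypothetical placeholder, and the assumption that the variable takes a path to a CA bundle file should be checked against defaults/main.yml.

```yaml
---
# Hypothetical path: replace it with the location of your own CA bundle.
# Assumption: the variable holds a file path, which the role then exposes
# to every component as REQUESTS_CA_BUNDLE.
archivematica_src_ca_custom_bundle: "/etc/ssl/certs/custom-ca-bundle.crt"
```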

Database requirements

The role expects a MySQL server configured with two databases, one for the MCP and one for the Storage Service. The easiest way to provide them is to use the ansible-percona role together with this role.

The following example shows the database settings for both roles. It creates and configures the SS and MCP databases with the ansible-percona role, reusing the Archivematica role variables:

---
# Archivematica role

# Database settings
archivematica_src_am_db_host: "localhost"        # Archivematica database host
archivematica_src_am_db_name: "MCP"              # Archivematica database name
archivematica_src_am_db_user: "archivematica"    # Archivematica database user
archivematica_src_am_db_password: "demo"         # Archivematica database password

archivematica_src_ss_db_mysql_enabled: "true"    # Use MySQL SS database. Set as false for sqlite3
archivematica_src_ss_db_host: "localhost"        # Archivematica Storage Service database host
archivematica_src_ss_db_name: "SS"               # Archivematica Storage Service database name
archivematica_src_ss_db_user: "archivematica"    # Archivematica Storage Service database user
archivematica_src_ss_db_password: "demo"         # Archivematica Storage Service database password
archivematica_src_ss_db_port: 3306               # Archivematica Storage Service database port

# Percona role

mysql_version_major: "5"
mysql_version_minor: "7"

mysql_databases:
  - name: "{{ archivematica_src_am_db_name }}"
    collation: "utf8_general_ci"
    encoding: "utf8"
  - name: "{{ archivematica_src_ss_db_name }}"
    collation: "utf8_general_ci"
    encoding: "utf8"
mysql_users:
  - name: "{{ archivematica_src_am_db_user }}"
    pass: "{{ archivematica_src_am_db_password }}"
    priv: "{{ archivematica_src_am_db_name }}.*:ALL,GRANT"
    host: "{{ archivematica_src_am_db_host }}"
  - name: "{{ archivematica_src_ss_db_user }}"
    pass: "{{ archivematica_src_ss_db_password }}"
    priv: "{{ archivematica_src_ss_db_name }}.*:ALL,GRANT"
    host: "{{ archivematica_src_ss_db_host }}"

To use the legacy sqlite3 database for the Archivematica Storage Service, set the archivematica_src_ss_db_mysql_enabled variable to false.
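For instance, a deployment that keeps the legacy sqlite3 database might override the variable as follows. The quoted string form mirrors the example above and is otherwise an assumption.

```yaml
---
# Keep the Storage Service on the legacy sqlite3 database.
archivematica_src_ss_db_mysql_enabled: "false"
```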

Migration to MySQL in Storage Service

Since Archivematica 1.13, MySQL is the default database engine in both Archivematica and Storage Service. It is possible that your Storage Service deployment is still using SQLite. If that's the case, please know that:

With archivematica_src_ss_db_mysql_enabled (default: true), this role configures the environment to ensure that MySQL is used. The connection parameters are better configured via the archivematica_src_ss_db_* variables.
With archivematica_src_migrate_sqlite3_enabled (default: false), this role will perform the migration automatically. Don't use this facility unless you have previously backed up the database. The location of the SQLite database may be indicated via archivematica_src_migrate_sqlite3_db_name, but the default value will likely work for you.
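A minimal sketch of the variables for an automated migration could look like this. The quoted string values follow the style of the database example and are an assumption; archivematica_src_migrate_sqlite3_db_name is left at its default here.

```yaml
---
# Back up the existing SQLite database before enabling the migration.
archivematica_src_ss_db_mysql_enabled: "true"        # Use MySQL for the Storage Service
archivematica_src_migrate_sqlite3_enabled: "true"    # Migrate the SQLite data automatically
```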

Backward-compatible logging

By default, Archivematica 1.7 logging sends events to the standard streams, which is more convenient when Archivematica is running in a cluster. To use backward-compatible logging instead, the boolean role variable archivematica_src_logging_backward_compatible must be enabled, which is the default in this role.

The log file sizes and the directories to store the logs are configurable for each service. The default values can be found in defaults/main.yml.
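To opt out of backward-compatible logging and use the standard streams, the variable could be disabled as sketched below. The exact value form (boolean or quoted string) is an assumption; check defaults/main.yml.

```yaml
---
# Send log events to the standard streams instead of using
# backward-compatible logging.
archivematica_src_logging_backward_compatible: false
```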

Configure ClamAV

This role tries to determine whether the ClamAV daemon is running with a TCP or UNIX socket on the server where the pipeline is being installed or updated. To use an external ClamAV daemon server, set the following role variables:

---
archivematica_src_mcpclient_clamav_use_tcp: "yes"
archivematica_src_mcpclient_clamav_tcp_ip: "1.2.3.4"
archivematica_src_mcpclient_clamav_tcp_port: "3310"

Disable Elasticsearch use

The default Archivematica install relies on Elasticsearch for several features (Archival storage, Backlog and Appraisal tabs). If you need to disable them, set the role variable archivematica_src_search_enabled to False.
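In a vars file, that setting would look like this:

```yaml
---
# Disable the Elasticsearch-backed features.
archivematica_src_search_enabled: False
```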

Deploy separate SS and pipeline

To deploy a separate Storage Service and pipeline, configure the following variable and dictionary (see examples in defaults/main.yml):

archivematica_src_remote_pipeline: The FQDN or IP address of the pipeline.
archivematica_src_remote_locations: A dictionary containing the locations to be enabled within the pipeline local filesystem space.

This is the procedure for deploying separate SS and pipeline VMs:

Deploy the Storage Service (SS) VM: Run the playbook, disabling the amsrc-remote-pipeline tag. For example:
```
ansible-playbook am-ss.yml -t archivematica-src --skip-tags=amsrc-remote-pipeline -l my_SS_in_inventory
```
Deploy the Pipeline VM: Register the pipeline in the SS. No additional steps are required if using the amsrc-configure variables to create users and register the pipeline.
Finalize Configuration on the SS VM: After both the SS and pipeline are deployed and the pipeline is registered, rerun the role on the SS VM with the amsrc-remote-pipeline tag enabled. For example:
```
ansible-playbook am-ss.yml -t amsrc-remote-pipeline -l my_SS_in_inventory
```

NOTE: It is possible to deploy and configure more than one pipeline. For instance, if the archivematica_src_remote_locations configuration is identical across VMs, you can set archivematica_src_remote_pipeline for a second pipeline as an extra variable:

ansible-playbook am-ss.yml -t amsrc-remote-pipeline -l my_SS_in_inventory -e archivematica_src_remote_pipeline=MY_SECOND_PIPELINE_FQDN

Alternatively, use separate configuration files for each pipeline, defining archivematica_src_remote_pipeline and archivematica_src_remote_locations as needed, and load them with:

ansible-playbook am-ss.yml -e @file/pipeline2.yml -t amsrc-remote-pipeline MORE_OPTIONS_HERE
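A per-pipeline file such as file/pipeline2.yml might look like the sketch below. The hostname is a hypothetical placeholder, and the archivematica_src_remote_locations dictionary is deliberately left out because its structure should be copied from the examples in defaults/main.yml.

```yaml
---
# file/pipeline2.yml: settings for a second pipeline (illustrative).
# Hypothetical FQDN: replace it with the address of your pipeline.
archivematica_src_remote_pipeline: "pipeline2.example.com"

# Define archivematica_src_remote_locations here as well if it differs
# from the first pipeline; see defaults/main.yml for its structure.
```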

Dependencies

Role dependencies are listed in meta/main.yml.

Notes regarding role dependencies:

geerlingguy.nodejs: if the task Create npm global directory fails, set nodejs_install_npm_user to {{ ansible_user_id }} as a workaround (details in https://github.com/artefactual-labs/ansible-archivematica-src/issues/151).
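Expressed as a variable definition, the workaround is:

```yaml
---
# Workaround for failures in the "Create npm global directory" task.
nodejs_install_npm_user: "{{ ansible_user_id }}"
```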

Please use Ansible 2.3 or newer with this role.

A number of tasks in this role use the json_query filter, which requires the JMESPath library to be installed on the Ansible controller. It can be installed as follows:

pip install jmespath
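One way to confirm that the filter works on the controller is a small throwaway playbook such as the sketch below. It is not part of this role, and on newer Ansible releases json_query is provided by the community.general collection, which may need to be installed separately.

```yaml
---
# Throwaway check: fails with an error if json_query cannot be used
# (for example, when jmespath is missing on the controller).
- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Extract the names from a small inline list
      debug:
        msg: "{{ [{'name': 'a'}, {'name': 'b'}] | json_query('[].name') }}"
```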

Example Playbooks

Please note that a complete Archivematica installation includes software not installed by this role, in particular:

MySQL compatible database server (MySQL, MariaDB, Percona)
Elasticsearch
ClamAV (daemon and client)

See https://github.com/artefactual/deploy-pub/tree/master/playbooks/archivematica to find examples.

It is also recommended to back up your system (Archivematica and Storage Service databases, AIPs, DIPs, etc.) before running an upgrade.

License

AGPLv3

Author Information

Artefactual Systems Inc. https://www.artefactual.com
