MauroDataMapper-NHSD / nhsd-datadictionary-docker

A fork of MauroDataMapper/mdm-docker, customised for the NHS Data Dictionary deployment
Apache License 2.0

Mauro Data Mapper for NHS Data Dictionary

These instructions explain how to set up and run the Mauro Data Mapper so that it can be used to manage the NHS Data Dictionary. This README covers using Docker to build images and run the Mauro Data Mapper in containers.

Deployment

See the aws-deployment.md document in the ./doc directory of this repository for the deployment process.

Docker

Before continuing, install Docker and Docker Compose tools by following the Get Docker guide. Installing Docker Desktop is usually the quickest way to install all the necessary tools, and is best for local development/testing.

Note: Running Docker Desktop on Windows may require admin privileges.

Alternatively, you can manually install the Docker Engine and related tools by following these guides (Linux):

These steps are more useful if setting up a virtual machine for deploying Mauro Data Mapper to a target environment for test/production use.

System Requirements

Minimum:

Recommended:

These requirements include what the operating system itself needs. However, the operations involved in managing the NHS Data Dictionary are expected to be resource intensive, particularly on disk I/O and memory. Therefore, you should aim for the recommended requirements where possible.

The default install of Docker on Linux configures the Docker Engine with unlimited access to the server's resources. However, if running on Windows or macOS, the Docker Toolbox will need to be configured. See Mauro Data Mapper - Docker Setup for more details.

Quick Start

Follow these quick start steps to build and run the Docker containers on your local machine.

First, build the container images by running the following command with the current directory set to the root of this repository:

    docker compose build

This will create two images:

Once the images are ready, run the containers in detached mode as follows:

    docker compose up -d

This starts a container from each of the two images, as follows:

The port number for Mauro and some other parameters are set via environment variables. These can either be set manually or via the .env.* files.

Note: After the mauro-data-mapper container starts, there is an initial load time before it responds to web traffic. View the container logs to see when it is ready, usually when the log message org.apache.catalina.startup.Catalina.start Server startup in [x] milliseconds appears.
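
As a minimal sketch of watching for that message, the helper below reads log lines on standard input and returns as soon as the startup line appears. The function name wait_for_startup is hypothetical, and the usage line assumes the Compose service is named mauro-data-mapper as described in this README.

```shell
# Hypothetical helper: reads log lines on stdin and returns 0 as soon as the
# Tomcat startup message appears, or non-zero if the input ends without it.
wait_for_startup() {
  # -q: exit quietly on the first match; -F: treat the pattern as a fixed string
  grep -qF 'org.apache.catalina.startup.Catalina.start Server startup in'
}

# Usage (assumes the service name mauro-data-mapper):
#   docker compose logs -f mauro-data-mapper | wait_for_startup && echo "Mauro is ready"
```

Because the log stream is followed with -f, the pipeline blocks until the message is seen, so it can be placed in front of any step that needs Mauro to be up.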

Note: If you prefer, you can (re)build the images and start the containers in a single command:

    docker compose up -d --build

Once running, in a browser navigate to:

You can also access the Mauro backend with an HTTP request tool, such as Postman or curl, via http://localhost.
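
For scripted checks, a small polling loop around curl can confirm that the server is answering. This is only a sketch: the function name wait_for_http and its defaults are hypothetical, and it assumes the URL you pass returns a successful HTTP status once Mauro is ready.

```shell
# Hypothetical helper: wait_for_http <url> [attempts] [delay-seconds]
# Polls <url> with curl until it answers with a successful status.
wait_for_http() {
  url=$1
  attempts=${2:-30}
  delay=${3:-5}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: fail on HTTP errors (>= 400); -sS: quiet but still report errors
    if curl -fsS -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (adjust the port if MDM_PORT is not 80):
#   wait_for_http http://localhost && echo "Mauro is responding"
```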

Sign in to the Mauro Data Mapper and the NHS Data Dictionary Orchestrator using the default username/password, as explained in the Mauro setup instructions. You will be prompted to change the initial password the first time you sign in. It is then recommended that you create further user accounts via Mauro Data Mapper.

Finally, to shut down the applications and stop the containers, run:

    docker compose down

Updating

The Dockerfile for the mauro-data-mapper container image fetches the Mauro commits/snapshots from git. By default, the develop branch of each required repository/snapshot is used during the build, so that the latest changes can be incorporated and run.

To update the running instance:

    # Stop/shutdown any running application
    docker compose down

    # Rebuild the container images with the latest snapshots
    docker compose build

    # Start the containers again
    docker compose up -d

Volumes and Files

The following shared volumes will be created and used:

Stopping the containers will not remove these volumes; they persist as storage that the containers connect to.

Docker top-level volumes are stored on the host under /var/lib/docker/volumes/ by default. This location can be changed (along with that of other Docker data) by adding the following line to /etc/docker/daemon.json and restarting the Docker daemon:

    "data-root": "/some/other/path"

Log File

The log file generated by Mauro is accessible via:

Reading the log file is more useful than viewing the Docker container logs, because the container logs only show the STDOUT of the process, which contains only errors and warnings. The log file also shows info and debug messages.

To watch a live stream of the log file, use:

    tail -f mauro-data-mapper.log

Cleanup

Continually building Docker images will leave a lot of loose snapshot images floating around. Use the following commands to clean up dangling resources:
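
As one possible sketch, the standard Docker CLI prune commands cover the common cases. The wrapper function name cleanup_dangling is hypothetical. Volumes are deliberately left alone here, because they hold the persisted data described under Volumes and Files.

```shell
# Hypothetical wrapper around standard Docker CLI prune commands.
cleanup_dangling() {
  # Remove all stopped containers (-f skips the confirmation prompt)
  docker container prune -f
  # Remove dangling images, i.e. untagged layers left behind by rebuilds
  docker image prune -f
}

# Usage:
#   cleanup_dangling
```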

When /var/lib/docker gets full, the docker system prune command frees more space: it removes stopped containers, unused networks, dangling images and the build cache. Adding -a also removes all unused images, and adding --volumes extends the prune to unused volumes. Refer to the documentation for more details: https://docs.docker.com/reference/cli/docker/system/prune/

Docker Compose

The docker-compose.yml file controls the build and setup of the containers. It is split into the two services required - the postgres and mauro-data-mapper services, which will build the images.

Run this command in a terminal to see the final docker-compose.yml file that will be used once all environment variables are interpolated in:

    docker compose config

The following sections explain the setup in more detail.

Environment Variables

The docker-compose.yml file has been written so that environment variables can be passed into the container images, letting them build and run correctly for your environment. By default, the .env file in the repo sets key variables to:

These are suitable for a local development build. However, you may need to adjust them when deploying to other environments, such as building from a particular git commit/tag for a known version.

You can create multiple .env files and name them appropriately, e.g. .env.live, .env.test etc. To use an .env file that does not have the default name, pass the --env-file argument to the docker compose tools, e.g.

    # .env.live file, for building a particular released version of Mauro
    MDM_APPLICATION_COMMIT=5.3.0
    MDM_UI_COMMIT=7.3.0

    MDM_PORT=80

    MDM_TAG=nhsd-1.0.0

    # Build the images on the live environment
    docker compose --env-file .env.live build
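
Compose only reads the default .env file automatically, so the same --env-file argument has to be passed to every later command (up, down, config) as well as to build. A minimal sketch of a wrapper that does this is shown below. The function name compose_with_env is hypothetical, and it assumes the files follow the .env.<name> naming used above.

```shell
# Hypothetical helper: compose_with_env <name> <docker compose arguments...>
# Runs docker compose with the matching .env.<name> file from the current directory.
compose_with_env() {
  if [ $# -lt 2 ]; then
    echo "usage: compose_with_env <name> <compose args...>" >&2
    return 2
  fi
  env_file=".env.$1"
  shift
  if [ ! -f "$env_file" ]; then
    echo "missing environment file: $env_file" >&2
    return 1
  fi
  docker compose --env-file "$env_file" "$@"
}

# Usage:
#   compose_with_env live build
#   compose_with_env live up -d
```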

Build and Runtime Configuration

For the mauro-data-mapper service, Mauro can be configured by passing in Grails properties as environment variables, in dot-notation. For example, the Grails property in application.yml:

    database:
      host: localhost

would be overridden in docker-compose.yml as:

    services:
      mauro-data-mapper:
        environment:
          database.host: another-host

To make it simpler, there are two files listed in the repo to control these configuration properties, found in mauro-data-mapper/config:

  1. build.yml - This is built into the service when the Docker image is being built; this is a standard Grails application.yml file.
  2. runtime.yml - This is loaded into the container via docker-compose.yml, and is intended to hold the environment variable overrides.

Properties to override

The following variables need to be overridden/set when starting up a new mauro-data-mapper image. Usually this is done in the docker-compose.yml or the build.yml file.

Free space

Docker uses a lot of space in /var/lib/docker; this can be released with:

    docker system prune
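
To prune only when space is actually running low, the check can be scripted. The sketch below is an assumption-laden example rather than part of this repository: the function names, the default path /var/lib/docker and the 90% threshold are all hypothetical choices.

```shell
# Hypothetical helper: prints the used percentage of the filesystem holding <path>.
usage_percent() {
  # df -P gives a stable one-line-per-filesystem format; field 5 is e.g. "42%"
  df -P "$1" | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# Hypothetical helper: prune_if_full [path] [threshold-percent]
prune_if_full() {
  path=${1:-/var/lib/docker}
  threshold=${2:-90}
  used=$(usage_percent "$path") || return 1
  if [ "$used" -ge "$threshold" ]; then
    # -f skips the confirmation prompt
    docker system prune -f
  else
    echo "usage ${used}% is below ${threshold}%; nothing pruned"
  fi
}

# Usage:
#   prune_if_full /var/lib/docker 90
```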